Telemetry in Junos for AI/ML Workloads
Author: Shalini Mukherjee

Introduction

Because AI cluster traffic requires lossless networks with high throughput and low latency, a critical element of the AI network is the collection of monitoring data. Junos Telemetry enables granular monitoring of key performance indicators, including thresholds and counters for congestion management and traffic load balancing. gRPC sessions support the streaming of telemetry data. gRPC is a modern, open-source, high-performance framework built on HTTP/2 transport. It natively supports bidirectional streaming and allows flexible custom metadata in the request headers.

The first step in telemetry is to know which data needs to be collected. We can then analyze this data in various formats. Once the data is collected, it is important to present it in a form that makes it easy to monitor the network, make decisions, and improve the service being offered. In this paper, we use a telemetry stack consisting of Telegraf, InfluxDB, and Grafana. This stack collects data using a push model. Traditional pull models are resource intensive, require manual intervention, and can leave information gaps in the data they collect. Push models overcome these limitations by delivering data asynchronously. The data is then enriched with user-friendly tags and names.

Once the data is in a more readable format, we store it in a database and use it in an interactive visualization web application to analyze the network. Figure 1 shows how this stack is designed for efficient data collection, storage, and visualization, from the network devices pushing data to the collector to the data being displayed on dashboards for analysis.

Figure 1: Telemetry stack for data collection, storage, and visualization

TIG Stack

We used an Ubuntu server to install all the software, including the TIG stack.

Telegraf
To collect data, we use Telegraf on an Ubuntu server running 22.04.2. The Telegraf version running in this demo is 1.28.5.
Telegraf is a plugin-driven server agent for collecting and reporting metrics. It uses processor plugins to enrich and normalize the data. The output plugins send this data to various data stores. In this document we use two input plugins: one for openconfig sensors and one for Juniper native sensors.
InfluxDB
To store the data in a time-series database, we use InfluxDB. The output plugin in Telegraf sends the data to InfluxDB, which stores it efficiently. We are using V1.8 because no CLI is available for V2 and above.
Grafana
Grafana is used to visualize this data. Grafana pulls the data from InfluxDB and lets users create rich, interactive dashboards. Here, we are running version 10.2.2.

Configuration On The Switch

To implement this stack, we first need to configure the switch as shown in Figure 2. We have used port 50051; any port can be used here. Log in to the QFX switch and add the following configuration.

Figure 2: Configuration on the switch

Note: This configuration is for labs/POCs only, because the password is transmitted in clear text. Use SSL to avoid this.
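After the switch is configured, a quick reachability check from the Ubuntu server can confirm that the gRPC port is open before Telegraf is set up. The following is a minimal sketch, not part of the original procedure; the function name `grpc_port_open` is hypothetical, and it only tests that a TCP connection can be made, not that gRPC or authentication works.

```python
import socket


def grpc_port_open(host: str, port: int = 50051, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This checks TCP reachability only. It does not validate the gRPC
    service itself or the credentials configured on the switch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `grpc_port_open("spine1")` would test the default port 50051 on a switch reachable under the name spine1, the demo hostname used in this paper.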

Environment

Figure 3: Environment

Nginx
This step is needed only if you cannot expose the port on which Grafana is hosted. Install nginx on the Ubuntu server to serve as a reverse proxy agent. Once nginx is installed, add the lines shown in Figure 4 to the "default" file and move the file from /etc/nginx to /etc/nginx/sites-enabled.

Figure 4: Lines to add to the nginx "default" file

Make sure the firewall is adjusted to give full access to the nginx service, as shown in Figure 5.

Figure 5: Firewall access for the nginx service

Once nginx is installed and the required changes are made, we should be able to access Grafana from a web browser by using the IP address of the Ubuntu server where all the software is installed.
There is a small glitch in Grafana that does not let you reset the default password. Use these steps if you run into this issue.
Steps to be performed on the Ubuntu server to set the password in Grafana:

  • Go to /var/lib/grafana/grafana.db
  • Install sqlite3
    o sudo apt install sqlite3
  • Run this command on your terminal
    o sqlite3 grafana.db
  • The sqlite command prompt opens; run the following query:
    > delete from user where login = 'admin';
  • Restart grafana and type admin as both the username and the password. It prompts for a new password.
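The same deletion can be scripted instead of typed at the sqlite prompt. The sketch below uses Python's standard sqlite3 module and assumes the database path and `user` table named in the steps above; the function name `delete_admin_user` is hypothetical. Taking a copy of grafana.db before running it is a sensible precaution.

```python
import sqlite3


def delete_admin_user(db_path: str) -> int:
    """Delete the 'admin' row from the user table and return the rows removed."""
    conn = sqlite3.connect(db_path)
    try:
        # The connection context manager commits on success and rolls back on error.
        with conn:
            cur = conn.execute("DELETE FROM user WHERE login = ?", ("admin",))
            return cur.rowcount
    finally:
        conn.close()
```

Calling `delete_admin_user("/var/lib/grafana/grafana.db")` with sufficient file permissions performs the query from the steps above; Grafana still needs to be restarted afterwards.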

Once all the software is installed, create the Telegraf config file that pulls the telemetry data from the switch and pushes it to InfluxDB.

Openconfig Sensor Plugin

On the Ubuntu server, edit the /etc/telegraf/telegraf.conf file to add all the required plugins and sensors. For openconfig sensors, we use the gNMI plugin shown in Figure 6. For demo purposes, add the hostname as "spine1", the port number "50051" used for gRPC, the username and password of the switch, and the number of seconds to wait before redialing after a failure.
In the subscription stanza, add a unique name ("cpu") for this sensor, the sensor path, and the time interval at which this data is picked up from the switch. Add the same plugin stanzas, inputs.gnmi and inputs.gnmi.subscription, for all the openconfig sensors.

Figure 6: gNMI plugin configuration for openconfig sensors
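Because the same two stanzas are repeated for every openconfig sensor, it can help to generate them. The sketch below renders one inputs.gnmi stanza with a single subscription as text for telegraf.conf. It uses option names from the Telegraf gNMI input plugin (addresses, username, password, redial, name, path, subscription_mode, sample_interval); the helper name `gnmi_input` and every value passed to it are illustrative assumptions, since the actual configuration in this paper is only shown in Figure 6.

```python
def gnmi_input(host, port, username, password, redial, name, path, interval):
    """Render a Telegraf [[inputs.gnmi]] stanza with one subscription."""
    return "\n".join([
        "[[inputs.gnmi]]",
        f'  addresses = ["{host}:{port}"]',
        f'  username = "{username}"',
        f'  password = "{password}"',
        f'  redial = "{redial}"',          # wait time before redialing on failure
        "",
        "  [[inputs.gnmi.subscription]]",
        f'    name = "{name}"',            # becomes the measurement name in InfluxDB
        f'    path = "{path}"',
        '    subscription_mode = "sample"',
        f'    sample_interval = "{interval}"',
    ])


# Hypothetical values: the credentials, intervals, and sensor path are placeholders.
stanza = gnmi_input("spine1", 50051, "user", "secret", "10s",
                    "cpu", "/components/component", "10s")
print(stanza)
```

Generating the text keeps the hostname, port, and credentials in one place when many sensors are subscribed.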

Native Sensor Plugin

This is the Juniper telemetry interface plugin, used for native sensors. In the same telegraf.conf file, add the native sensor plugin inputs.jti_openconfig_telemetry, whose fields are almost the same as those of the openconfig plugin. Use a unique client ID for every sensor; here, we use "telegraf3". The unique name used here for this sensor is "mem" (Figure 7).

Figure 7: Native sensor plugin configuration

Lastly, add the output plugin outputs.influxdb to send this sensor data to InfluxDB. Here, the database is named "telegraf", with the username "influx" and the password "influxdb" (Figure 8).

Figure 8: InfluxDB output plugin configuration

Once you have edited the telegraf.conf file, restart the telegraf service. Then check in the InfluxDB CLI that measurements have been created for all the unique sensors. Type "influx" to enter the InfluxDB CLI.

Figure 9: Measurements listed in the InfluxDB CLI

As seen in Figure 9, enter the InfluxDB prompt and use the database "telegraf". All the unique names given to the sensors are listed as measurements.
To see the output of any one measurement, and so confirm that the telegraf file is correct and the sensor is working, use the command "select * from cpu limit 1" as shown in Figure 10.

Figure 10: Output of a single measurement
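The same check can be made without the interactive CLI, because InfluxDB 1.x also answers InfluxQL queries over HTTP on its /query endpoint. The sketch below only builds the request URL; the helper name `influx_query_url` and the address 192.0.2.10 are hypothetical, and port 8086 is the InfluxDB port used elsewhere in this paper. If authentication is enabled, credentials also have to be supplied, which this sketch leaves out.

```python
from urllib.parse import urlencode


def influx_query_url(host: str, db: str, query: str, port: int = 8086) -> str:
    """Build an InfluxDB 1.x /query URL for an InfluxQL statement."""
    # urlencode percent-encodes the statement so it is safe in a query string.
    return f"http://{host}:{port}/query?" + urlencode({"db": db, "q": query})


# 192.0.2.10 stands in for the Ubuntu server address.
url = influx_query_url("192.0.2.10", "telegraf", "select * from cpu limit 1")
print(url)
```

Opening the resulting URL with a tool such as curl should return the first point of the "cpu" measurement as JSON if the sensor is working.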

Every time changes are made to the telegraf.conf file, make sure to stop InfluxDB, restart Telegraf, and then start InfluxDB.
Log in to Grafana from the browser and create dashboards after confirming that the data is being collected correctly.
Go to Connections > InfluxDB > Add new data source.

  1. Give this data source a name. In this demo it is "test-1".
  2. Under the HTTP stanza, use the Ubuntu server IP and port 8086.
  3. In the InfluxDB details, use the same database name, "telegraf," and provide the username and password of the Ubuntu server.
  4. Click Save & test. Make sure you see the message "successful".
  5. Once the data source has been added successfully, go to Dashboards and click New. Let us create a few dashboards that are essential for AI/ML workloads in editor mode.

Examples of Sensor Graphs

The following are examples of some major counters that are essential for monitoring an AI/ML network.
Percentage utilization for an ingress interface et-0/0/0 on spine-1

  • Select the data source as test-1.
  • In the FROM section, select the measurement as "interface". This is the unique name used for this sensor path.
  • In the WHERE section, select device::tag, and in the tag value, select the hostname of the switch, that is, spine1.
  • In the SELECT section, choose the sensor branch that you want to monitor; in this case choose "field(/interfaces/interface[if_name='et-0/0/0']/state/counters/if_in_1s_octets)". In the same section, click "+" and add the math expression (/ 50000000000 * 100). This calculates the percentage utilization of a 400G interface: 400 Gbps equals 50,000,000,000 bytes per second, so dividing the octets-per-second counter by that value and multiplying by 100 gives the percentage.
  • Make sure the FORMAT is "time-series," and name the graph in the ALIAS section.
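The arithmetic behind the math expression can be checked with a few lines of Python. This is a small sketch of the same calculation, with the link speed as a parameter so that it also applies to interfaces other than 400G; the function name `utilization_percent` is hypothetical.

```python
def utilization_percent(octets_per_second: float, link_speed_bps: float = 400e9) -> float:
    """Convert an octets-per-second counter into percentage link utilization."""
    # A link of N bits/s carries N / 8 bytes/s; for 400G that is 50e9 bytes/s,
    # the divisor used in the Grafana math expression.
    capacity_bytes_per_second = link_speed_bps / 8
    return octets_per_second / capacity_bytes_per_second * 100


# 25e9 octets/s on a 400G interface is half of its capacity.
print(utilization_percent(25_000_000_000))  # 50.0
```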

Peak buffer occupancy for any queue

  • Select the data source as test-1.
  • In the FROM section, select the measurement as "buffer."
  • In the WHERE section, there are three fields to fill. Select device::tag, and in the tag value select the hostname of the switch (that is, spine-1); AND select /cos/interfaces/interface/@name::tag and select the interface (that is, et-0/0/0); AND select the queue as well, /cos/interfaces/interface/queues/queue/@queue::tag, and select queue number 4.
  • In the SELECT section, choose the sensor branch that you want to monitor; in this case choose "field(/cos/interfaces/interface/queues/queue/PeakBufferOccupancy)."
  • Make sure the FORMAT is "time-series" and name the graph in the ALIAS section.
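The three WHERE selections above correspond to a single parenthesized condition in a raw InfluxQL query, with the tag filters joined by AND. The sketch below assembles that condition from a dictionary of tag keys and values; the helper name `where_clause` is hypothetical, and it does not escape quotes inside values, so it is suitable only for simple names like the ones used here.

```python
def where_clause(tags: dict) -> str:
    """Join tag filters into an InfluxQL condition such as ("k"::tag = 'v' AND ...)."""
    template = "\"{key}\"::tag = '{value}'"
    conditions = [template.format(key=k, value=v) for k, v in tags.items()]
    return "(" + " AND ".join(conditions) + ")"


# The three filters from the steps above: switch, interface, and queue number.
print(where_clause({
    "device": "spine-1",
    "/cos/interfaces/interface/@name": "et-0/0/0",
    "/cos/interfaces/interface/queues/queue/@queue": "4",
}))
```

The printed condition has the same shape as the WHERE clauses in the raw queries used for the derivative graphs in this paper.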

You can collate data for multiple interfaces on the same graph, as seen in Figure 17 for et-0/0/0, et-0/0/1, et-0/0/2, and so on.

Figure 17: Data for multiple interfaces on the same graph

PFC and ECN mean derivative

To find the mean derivative (the difference in value within a time range), use the raw query mode.
This is the raw query we used to find the mean derivative between two PFC values on et-0/0/0 of Spine-1 in one second.
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/pfc-counter/tx_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)

Similarly for ECN:

SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/8']/state/error-counters/ecn_ce_marked_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
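To make the result of these queries easier to interpret: derivative(mean(...), 1s) first averages the counter within each GROUP BY time bucket and then reports the rate of change between consecutive buckets, normalized to the 1s unit. The sketch below reproduces the second step in Python on already-averaged points. The function name and the sample values are illustrative, and the non_negative option mirrors InfluxQL's non_negative_derivative, which is used later in this paper.

```python
def derivative(points, unit_seconds=1.0, non_negative=False):
    """Rate of change between consecutive (timestamp_seconds, value) points.

    points must be sorted by time. Each rate is normalized to unit_seconds,
    like the second argument of InfluxQL's derivative().
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        rate = (v1 - v0) / ((t1 - t0) / unit_seconds)
        if non_negative and rate < 0:
            continue  # drop negative rates, for example after a counter reset
        rates.append((t1, rate))
    return rates


# Illustrative bucket means of a packet counter, sampled every 10 seconds.
samples = [(0, 100), (10, 300), (20, 250)]
print(derivative(samples))                     # [(10, 20.0), (20, -5.0)]
print(derivative(samples, non_negative=True))  # [(10, 20.0)]
```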


Input resource errors mean derivative

The raw query for the resource errors mean derivative is:
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/error-counters/if_in_resource_errors"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)

Tail drops mean derivative

The raw query for the tail drops mean derivative is:
SELECT derivative(mean("/cos/interfaces/interface/queues/queue/tailDropBytes"), 1s) FROM "buffer" WHERE ("device"::tag = 'Leaf-1' AND "/cos/interfaces/interface/@name"::tag = 'et-0/0/0' AND "/cos/interfaces/interface/queues/queue/@queue"::tag = '4') AND $timeFilter GROUP BY time($__interval) fill(null)
CPU utilization

  • Select the data source as test-1.
  • In the FROM section, select the measurement as "newcpu".
  • In the WHERE section, there are three fields to fill. Select device::tag, and in the tag value select the hostname of the switch (that is, spine-1); AND in /components/component/properties/property/name::tag, select cpuutilization-total; AND in name::tag, select RE0.
  • In the SELECT section, choose the sensor branch that you want to monitor. In this case, choose "field(state/value)".

Raw query for finding the non-negative derivative of tail drops for multiple switches on multiple interfaces, in bits/sec:
SELECT non_negative_derivative(mean("/cos/interfaces/interface/queues/queue/tailDropBytes"), 1s)*8 FROM "buffer" WHERE (device::tag =~ /^Spine-[1-2]$/) AND ("/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/[0-9]/ OR "/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/1[0-5]/) AND $timeFilter GROUP BY time($__interval), device::tag fill(null)
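The interface filters in this query are unanchored regular expressions, so they match any tag value that contains the pattern. The sketch below uses Python's re module to show which interface names the two patterns select. For these simple patterns Python's syntax behaves the same as the Go-style regular expressions InfluxQL uses. The anchored variant is a hypothetical alternative, not part of the original query, for the case where only et-0/0/0 through et-0/0/15 are intended.

```python
import re

# The two patterns from the query, without the InfluxQL "\/" escaping.
FIRST = re.compile(r"et-0/0/[0-9]")
SECOND = re.compile(r"et-0/0/1[0-5]")

# Hypothetical stricter alternative: exactly et-0/0/0 through et-0/0/15.
ANCHORED = re.compile(r"^et-0/0/([0-9]|1[0-5])$")


def matches_query_filter(name: str) -> bool:
    """Mimic the WHERE clause: first pattern OR second pattern, unanchored."""
    return bool(FIRST.search(name) or SECOND.search(name))


def matches_anchored(name: str) -> bool:
    """Match only when the whole name is et-0/0/0 through et-0/0/15."""
    return bool(ANCHORED.search(name))


for name in ["et-0/0/9", "et-0/0/15", "et-0/0/16"]:
    print(name, matches_query_filter(name), matches_anchored(name))
```

Because the first pattern is unanchored, it already matches et-0/0/16 through its leading "et-0/0/1", so the query as written may include more interfaces than the range 0 to 15 suggests.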


These were some examples of the graphs that can be created to monitor an AI/ML network.

Summary

This paper demonstrates how to pull telemetry data and visualize it by creating graphs. It focuses on AI/ML sensors, both native and openconfig, but the setup can be used for all kinds of sensors. We have also included solutions for several issues that you might face while building the setup. The steps and outputs shown in this paper are specific to the versions of the TIG stack mentioned earlier. They are subject to change depending on the version of the software, the sensors, and the Junos version.

References

Juniper Yang Data Model Explorer for all sensor options
https://apps.juniper.net/ydm-explorer/
Openconfig forum for openconfig sensors
https://www.openconfig.net/projects/models/

Copyright 2023 Juniper Networks, Inc. All rights reserved. Juniper Networks, the Juniper Networks logo, Juniper, Junos, and other trademarks are registered trademarks of Juniper Networks, Inc. and/or its affiliates in the United States and other countries. Other names may be trademarks of their respective owners. Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.
Send feedback to: design-center-comments@juniper.net V1.0/240807/ejm5-telemetry-junos-ai-ml
