Telemetry in Junos for AI/ML Workloads
Author: Shalini Mukherjee
Introduction
Because AI cluster traffic requires lossless networks with high throughput and low latency, collecting monitoring data is a critical part of operating an AI network. Junos Telemetry enables granular monitoring of key performance indicators, including thresholds and counters for congestion management and traffic load balancing. gRPC sessions support the streaming of telemetry data. gRPC is a modern, open-source, high-performance framework built on HTTP/2 transport. It provides native bidirectional streaming and supports flexible custom metadata in request headers.
The first step in telemetry is to decide which data should be collected. We can then analyze this data in various formats. Once the data is gathered, it is important to present it in a form that is easy to monitor, so that it can drive decisions and improve the service being offered. In this paper, we use a telemetry stack consisting of Telegraf, InfluxDB, and Grafana. This telemetry stack collects data using a push model. Traditional pull models are resource intensive, require manual intervention, and can leave information gaps in the data they collect. Push models overcome these limitations by delivering data asynchronously. They also enrich the data with user-friendly tags and names.
Once the data is in a more readable format, we store it in a database and use it in an interactive visualization web application to analyze the network. Figure 1 shows how this stack is designed for efficient data collection, storage, and visualization, from the network devices pushing data to the collector through to the data displayed on dashboards for analysis.
TIG Stack
We used an Ubuntu server to install all the software, including the TIG stack.
Telegraf
To collect data, we use Telegraf on an Ubuntu server running 22.04.2. The Telegraf version running in this demo is 1.28.5.
Telegraf is a plugin-driven server agent for collecting and reporting metrics. It uses processor plugins to enrich and normalize the data. The output plugins send this data to various data stores. In this document we use two input plugins: one for openconfig sensors and the other for Juniper native sensors.
InfluxDB
To store the data in a time series database, we use InfluxDB. The output plugin in Telegraf sends the data to InfluxDB, which stores it efficiently. We are using V1.8 because there is no CLI present for V2 and above.
Grafana
Grafana is used to visualize this data. Grafana pulls the data from InfluxDB and lets users create rich, interactive dashboards. Here, we are running version 10.2.2.
Configuration On The Switch
To implement this stack, we first need to configure the switch as shown in Figure 2. We have used port 50051, but any port can be used here. Log in to the QFX switch and add the following configuration.
Note: This configuration is intended for labs/POCs because the password is transmitted in clear text. Use SSL to avoid this.
Environment
Nginx
This is needed if you are unable to expose the port on which Grafana is hosted. The next step is to install nginx on the Ubuntu server to act as a reverse proxy agent. Once nginx is installed, add the lines shown in Figure 4 to the "default" file and move the file from /etc/nginx to /etc/nginx/sites-enabled.
Make sure that the firewall is adjusted to give full access to the nginx service, as shown in Figure 5.
Once nginx is installed and the required changes are made, we should be able to access Grafana from a web browser by using the IP address of the Ubuntu server where all the software is installed.
There is a small glitch in Grafana that does not let you reset the default password. Use the following steps if you run into this issue.
Steps to be performed on the Ubuntu server to set the password in Grafana:
- Go to /var/lib/grafana, the directory that holds grafana.db
- Install sqlite3
o sudo apt install sqlite3
- Run this command on your terminal
o sqlite3 grafana.db
- The sqlite command prompt opens; run the following query:
> delete from user where login = 'admin';
- Restart Grafana and type admin as both the username and the password. It prompts for a new password.
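The password reset steps above can also be scripted. The following is a minimal Python sketch, not part of the original procedure: the function name `delete_admin_user` is hypothetical, and it assumes the Grafana SQLite database has the `user` table with a `login` column that the query above relies on. It would need to run with write permission on the database file.

```python
import sqlite3


def delete_admin_user(db_path: str) -> int:
    """Delete the 'admin' row from the user table and return the rows removed.

    Mirrors the manual query: delete from user where login = 'admin';
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute("DELETE FROM user WHERE login = ?", ("admin",))
            return cur.rowcount
    finally:
        conn.close()
```

For example, `delete_admin_user("/var/lib/grafana/grafana.db")` returns the number of rows deleted. Grafana still has to be restarted afterwards, as described in the last step.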
Once all the software is installed, create the config file in Telegraf that will pull the telemetry data from the switch and push it to InfluxDB.
Openconfig Sensor Plugin
On the Ubuntu server, edit the /etc/telegraf/telegraf.conf file to add all the required plugins and sensors. For openconfig sensors, we use the gNMI plugin shown in Figure 6. For demo purposes, add the hostname as "spine1", the port number "50051" that is used for gRPC, the username and password of the switch, and the number of seconds to wait before redialing after a failure.
In the subscription stanza, add a unique name for this particular sensor ("cpu"), the sensor path, and the time interval at which this data is picked up from the switch. Add the same inputs.gnmi and inputs.gnmi.subscription plugin stanzas for all openconfig sensors (Figure 6).
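Since Figure 6 is not reproduced in the text, the shape of the stanza described above can be sketched by rendering it from Python. This is only an illustration under stated assumptions: the helper `render_gnmi_input` is hypothetical, the option names (`addresses`, `username`, `password`, `redial`, `name`, `path`, `subscription_mode`, `sample_interval`) are those of the Telegraf gNMI input plugin, and the credentials, redial time, sample interval, and sensor path passed in are placeholders rather than the values used in the demo.

```python
def render_gnmi_input(address, username, password, redial,
                      name, path, sample_interval):
    """Return a telegraf.conf fragment for one gNMI input and one subscription."""
    return f'''[[inputs.gnmi]]
  addresses = ["{address}"]
  username = "{username}"
  password = "{password}"
  redial = "{redial}"

  [[inputs.gnmi.subscription]]
    name = "{name}"
    path = "{path}"
    subscription_mode = "sample"
    sample_interval = "{sample_interval}"
'''


# Placeholder values (assumptions): only "spine1", "50051" and "cpu"
# come from the text; everything else is illustrative.
fragment = render_gnmi_input(
    address="spine1:50051",
    username="SWITCH_USER",
    password="SWITCH_PASSWORD",
    redial="10s",
    name="cpu",
    path="/components/component/properties/property",
    sample_interval="10s",
)
```

Printing `fragment` gives text that can be compared against Figure 6 before anything is pasted into /etc/telegraf/telegraf.conf.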
Native Sensor Plugin
This is a Juniper telemetry interface plugin used for native sensors. In the same telegraf.conf file, add the native sensor plugin inputs.jti_openconfig_telemetry, whose fields are almost the same as those of the openconfig plugin. Use a unique client ID for every sensor; here, we use "telegraf3". The unique name used here for this sensor is "mem" (Figure 7).
Lastly, add the output plugin outputs.influxdb to send this sensor data to InfluxDB. Here, the database is named "telegraf", with the username "influx" and the password "influxdb" (Figure 8).
Once you have edited the telegraf.conf file, restart the telegraf service. Now check in the InfluxDB CLI that measurements have been created for all the unique sensors. Type "influx" to enter the InfluxDB CLI.
As seen in Figure 9, enter the InfluxDB prompt and use the database "telegraf". All the unique names given to the sensors are listed as measurements.
To see the output of any one measurement, and so confirm that the telegraf file is correct and the sensor is working, use the command "select * from cpu limit 1" as shown in Figure 10.
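The same check can be made without the interactive CLI, through the InfluxDB 1.x HTTP `/query` endpoint. The sketch below is an illustration, not part of the original setup: the function names are hypothetical, it assumes InfluxDB listens on port 8086 as used later in this paper, and it assumes authentication is disabled (with authentication enabled, the 1.x API also accepts `u` and `p` parameters).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(host, database, query, port=8086):
    """Build the URL for an InfluxDB 1.x /query request."""
    params = urlencode({"db": database, "q": query})
    return f"http://{host}:{port}/query?{params}"


def run_query(host, database, query, port=8086, timeout=5.0):
    """Send the query and return the decoded JSON response."""
    url = build_query_url(host, database, query, port)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `run_query("127.0.0.1", "telegraf", "select * from cpu limit 1")` should return one point if the "cpu" sensor is being collected, and `run_query("127.0.0.1", "telegraf", "show measurements")` lists the measurement names.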
Every time changes are made to the telegraf.conf file, make sure to stop InfluxDB, restart Telegraf, and then start InfluxDB.
Log in to Grafana from the browser and create dashboards once you have confirmed that the data is being collected correctly.
Go to Connections > InfluxDB > Add new data source.
- Give a name to this data source. In this demo it is "test-1".
- Under the HTTP stanza, use the Ubuntu server IP and port 8086.
- In the InfluxDB details, use the same database name, "telegraf", and provide the username and password of the Ubuntu server.
- Click Save & test. Make sure that you see a message reporting success.
- Once the data source is added successfully, go to Dashboards and click New. Let's create a few dashboards that are essential for AI/ML workloads, in editor mode.
Examples Of Sensor Graphs
The following are examples of some major counters that are essential for monitoring an AI/ML network.
Percentage utilization of the ingress interface et-0/0/0 on spine-1
- Select the data source as test-1.
- In the FROM section, select the measurement as "interface". This is the unique name used for this sensor path.
- In the WHERE section, select device::tag, and in the tag value, select the hostname of the switch, that is, spine1.
- In the SELECT section, choose the sensor branch that you want to monitor; in this case choose "field(/interfaces/interface[if_name='et-0/0/0']/state/counters/if_in_1s_octets)". Then, in the same section, click "+" and add this math calculation: (/ 50000000000 * 100). This computes the percentage utilization of a 400G interface: 400 Gbps corresponds to 50,000,000,000 octets per second, so dividing the per-second octet count by that figure and multiplying by 100 gives a percentage.
- Make sure that FORMAT is "time-series", and name the graph in the ALIAS section.
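The math in the SELECT step can be checked with a short worked example. This is a sketch only: the function name is hypothetical, and it assumes that `if_in_1s_octets` reports octets received per second, as the counter name suggests.

```python
LINK_SPEED_BPS = 400e9  # 400G interface, in bits per second


def utilization_percent(octets_per_second, link_speed_bps=LINK_SPEED_BPS):
    """Convert an octets-per-second reading into percent of link capacity."""
    capacity_octets_per_second = link_speed_bps / 8  # 50,000,000,000 for 400G
    return octets_per_second / capacity_octets_per_second * 100
```

A reading of 12,500,000,000 octets per second on a 400G port therefore corresponds to 25 percent utilization. For a different port speed, only the divisor changes, for example 12,500,000,000 for a 100G interface.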
Peak buffer occupancy for any queue
- Select the data source as test-1.
- In the FROM section, select the measurement as "buffer".
- In the WHERE section, there are three fields to fill in. Select device::tag, and in the tag value select the hostname of the switch (that is, spine-1); AND select /cos/interfaces/interface/@name::tag and choose the interface (that is, et-0/0/0); AND select the queue as well, /cos/interfaces/interface/queues/queue/@queue::tag, and choose queue number 4.
- In the SELECT section, choose the sensor branch that you want to monitor; in this case choose "field(/cos/interfaces/interface/queues/queue/PeakBufferOccupancy)".
- Make sure that FORMAT is "time-series", and name the graph in the ALIAS section.
You can collate data for multiple interfaces on the same graph, as seen in Figure 17 for et-0/0/0, et-0/0/1, et-0/0/2, and so on.
PFC and ECN mean derivative
To find the mean derivative (the difference in value within a time range), use the raw query mode.
This is the raw query that we used to find the per-second mean derivative of the PFC counter on et-0/0/0 of Spine-1, followed by the equivalent query for the ECN counter on et-0/0/8.
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/pfc-counter/tx_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/8']/state/error-counters/ecn_ce_marked_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
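To make the meaning of derivative(mean(...), 1s) with GROUP BY time(...) concrete, the following Python sketch reproduces the calculation on a handful of samples: average the counter inside each time bucket, then take the rate of change between consecutive bucket means, normalized to one second. It is an illustration of the idea, not InfluxDB's implementation, and the function name and sample values are made up.

```python
from collections import defaultdict


def derivative_of_mean(samples, interval_s, unit_s=1.0):
    """samples: iterable of (timestamp_seconds, counter_value).

    Returns a list of (bucket_start, rate_per_unit) pairs.
    """
    # Group samples into time buckets, like GROUP BY time(interval).
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // interval_s) * interval_s].append(value)

    # mean(): one averaged value per bucket, in time order.
    means = sorted((start, sum(v) / len(v)) for start, v in buckets.items())

    # derivative(..., 1s): change between consecutive means per unit of time.
    rates = []
    for (t0, m0), (t1, m1) in zip(means, means[1:]):
        rates.append((t1, (m1 - m0) / (t1 - t0) * unit_s))
    return rates
```

With a counter that grows by 100 packets every 5 seconds and 10-second buckets, each bucket mean is 200 higher than the previous one, so the result is 20 packets per second for every bucket after the first.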
Input resource errors mean derivative
The raw query for the resource errors mean derivative is:
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/error-counters/if_in_resource_errors"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
Tail drops mean derivative
The raw query for the tail drops mean derivative is:
SELECT derivative(mean("/cos/interfaces/interface/queue/queue/tailDropBytes"), 1s) FROM "buffer" WHERE ("device"::tag = 'Leaf-1' AND "/cos/interfaces/interface/@name"::tag = 'et-0/0/0' AND "/cos/interfaces/interface/queues/queue/@queue"::tag = '4') AND $timeFilter GROUP BY time($__interval) fill(null)
CPU utilization
- Select the data source as test-1.
- In the FROM section, select the measurement as "newcpu".
- In the WHERE section, there are three fields to fill in. Select device::tag, and in the tag value select the hostname of the switch (that is, spine-1); AND in /components/component/properties/property/name::tag, select cpu-utilization-total; AND in name::tag, select RE0.
- In the SELECT section, choose the sensor branch that you want to monitor. In this case, choose "field(state/value)".
The raw query for finding the non-negative derivative of tail drops for multiple switches on multiple interfaces, in bits/sec:
SELECT non_negative_derivative(mean("/cos/interfaces/interface/queue/queue/tailDropBytes"), 1s)*8 FROM "buffer" WHERE (device::tag =~ /^Spine-[1-2]$/) AND ("/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/[0-9]/ OR "/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/1[0-5]/) AND $timeFilter GROUP BY time($__interval), device::tag fill(null)
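Two details of this query are easy to miss, and the sketch below illustrates both in Python under stated assumptions. First, the interface patterns are not anchored, so a pattern such as et-0/0/[0-9] also matches names like et-0/0/16; if the intent is to restrict the graph to et-0/0/0 through et-0/0/15, an anchored pattern may be needed. Second, the *8 converts a byte rate into bits per second, and the non-negative variant discards negative rates such as those caused by a counter reset. The function names and the anchored pattern are suggestions, not part of the original query.

```python
import re

# The two patterns from the query, written without the \/ escaping
# that the /.../ regex literal requires.
QUERY_PATTERNS = [re.compile(r"et-0/0/[0-9]"), re.compile(r"et-0/0/1[0-5]")]

# Hypothetical anchored alternative covering only ports 0 through 15.
ANCHORED_PATTERN = re.compile(r"^et-0/0/([0-9]|1[0-5])$")


def matches_query_filter(name):
    """True if either unanchored pattern is found anywhere in the name."""
    return any(p.search(name) for p in QUERY_PATTERNS)


def matches_ports_0_to_15(name):
    """True only for et-0/0/0 through et-0/0/15."""
    return ANCHORED_PATTERN.search(name) is not None


def non_negative_bits_per_second(prev_bytes, curr_bytes, elapsed_s):
    """Byte counter delta as bits/sec, or None when the rate is negative."""
    rate = (curr_bytes - prev_bytes) / elapsed_s
    return None if rate < 0 else rate * 8
```

Here `matches_query_filter("et-0/0/16")` is true while `matches_ports_0_to_15("et-0/0/16")` is false. On a switch with no ports beyond et-0/0/15 the two filters behave the same way.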
These were some examples of the graphs that can be created for monitoring an AI/ML network.
Summary
This paper shows how to pull telemetry data and visualize it by creating graphs. It focuses on AI/ML sensors, both native and openconfig, but the setup can be used for all kinds of sensors. We have also included solutions for several issues that you might face while creating the setup. The steps and outputs shown in this paper are specific to the versions of the TIG stack mentioned earlier. They are subject to change depending on the software version, the sensors, and the Junos version.
References
Juniper Yang Data Model Explorer for all sensor options
https://apps.juniper.net/ydm-explorer/
Openconfig forum for openconfig sensors
https://www.openconfig.net/projects/models/
Copyright 2023 Juniper Networks, Inc. All rights reserved. Juniper Networks, the Juniper Networks logo, Juniper, Junos, and other trademarks are registered trademarks of Juniper Networks, Inc. and/or its affiliates in the United States and other countries. Other names may be trademarks of their respective owners. Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.
Send feedback to: design-center-comments@juniper.net V1.0/240807/ejm5-telemetry-junos-ai-ml