Telemetry in Junos for AI/ML Workloads
Author: Shalini Mukherjee
Introduction
Because AI cluster traffic requires lossless networks with high throughput and low latency, a critical element of an AI network is the collection of monitoring data. Junos Telemetry enables granular monitoring of key performance indicators, including thresholds and counters for congestion management and traffic load balancing. gRPC sessions carry the telemetry data. gRPC is a modern, open-source, high-performance framework built on HTTP/2 transport. It provides native bidirectional streaming and supports flexible custom metadata in request headers.
The first step in telemetry is knowing what data to collect. We can then analyze that data in various formats. Once the data is collected, it is important to present it in a format that makes it easy to monitor the network, make decisions, and improve the service being offered. In this paper, we use a telemetry stack consisting of Telegraf, InfluxDB, and Grafana.
This telemetry stack collects data using a push model. Traditional pull models are resource-intensive, require manual intervention, and can leave information gaps in the data they collect. Push models overcome these limitations by delivering data asynchronously. They also enrich the data with user-friendly tags and names. Once the data is in a more readable format, we store it in a database and use it in an interactive web visualization application to analyze the network.
Figure 1 shows how this stack is designed for efficient data collection, storage, and visualization, from the network devices pushing data to the collector through to the data displayed on dashboards for analysis.
TIG Stack
We used an Ubuntu server to install all the software, including the TIG stack.
Telegraf
To collect data, we use Telegraf on an Ubuntu server running 22.04.2. The Telegraf version running in this demo is 1.28.5.
Telegraf is a plugin-driven server agent for collecting and reporting metrics. It uses processor plugins to enrich and normalize the data. Output plugins send the data to various data stores. In this document we use two input plugins: one for openconfig sensors and one for Juniper native sensors.
InfluxDB
To store the data in a time-series database, we use InfluxDB. The output plugin in Telegraf sends the data to InfluxDB, which stores it in a highly efficient manner. We are using V1.8 because there is no CLI present for V2 and above.
Grafana
Grafana is used to visualize this data. Grafana pulls the data from InfluxDB and lets users create rich, interactive dashboards. Here, we are running version 10.2.2.
Configuration on the Switch
To implement this stack, we first need to configure the switch as shown in Figure 2. We have used port 50051, but any port can be used here. Log in to the QFX switch and add the following configuration.
Note: This configuration is intended for labs/POCs, because the password is transmitted in clear text. Use SSL to avoid this.
Environment
Nginx
This step is needed if you are unable to expose the port on which Grafana is hosted. Install nginx on the Ubuntu server to serve as a reverse proxy. Once nginx is installed, add the lines shown in Figure 4 to the "default" file and move the file from /etc/nginx to /etc/nginx/sites-enabled.
Make sure the firewall is adjusted to give full access to the nginx service, as shown in Figure 5.
Once nginx is installed and the required changes are made, we can access Grafana from a web browser by using the IP address of the Ubuntu server where all the software is installed.
There is a small glitch in Grafana that does not let you reset the password. Use the following steps if you run into this issue.
Steps to perform on the Ubuntu server to set the password in Grafana:
- Locate the Grafana database at /var/lib/grafana/grafana.db
- Install sqlite3
o sudo apt install sqlite3
- Run this command on your terminal
o sqlite3 grafana.db
- The sqlite command prompt opens; run the following query:
> delete from user where login = 'admin'
- Restart grafana and type admin as both the username and the password. Grafana then prompts for a new password.
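The reset steps above can also be scripted. The following is a minimal Python sketch, assuming the standard-library sqlite3 module and that it is run with enough privileges while Grafana is stopped. The function name delete_admin_user is illustrative and is not part of Grafana; it issues the same DELETE statement as the query above.

```python
import sqlite3


def delete_admin_user(conn: sqlite3.Connection, login: str = "admin") -> int:
    """Delete the row for `login` from the `user` table.

    Returns the number of rows removed. Grafana is expected to recreate
    the default admin account on restart, as described in the steps above.
    """
    cur = conn.execute("DELETE FROM user WHERE login = ?", (login,))
    conn.commit()
    return cur.rowcount
```

To apply it to the real database, one would open the connection with sqlite3.connect("/var/lib/grafana/grafana.db") and pass it to the function. Taking a backup copy of grafana.db first is a sensible precaution.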
Once all the software is installed, create the config file in Telegraf that pulls the telemetry data from the switch and pushes it to InfluxDB.
Openconfig Sensor Plugin
On the Ubuntu server, edit the /etc/telegraf/telegraf.conf file to add all the required plugins and sensors. For openconfig sensors, we use the gNMI plugin shown in Figure 6. For demo purposes, set the hostname to "spine1", the port number to "50051" (the port used for gRPC), the username and password of the switch, and the number of seconds to wait before redialing after a failure.
In the subscription stanza, add a unique name for this particular sensor ("cpu"), the sensor path, and the time interval for fetching this data from the switch. Add the same inputs.gnmi and inputs.gnmi.subscription plugin stanzas for all the openconfig sensors (Figure 6).
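Since the figure is not reproduced here, the following Python sketch renders a configuration fragment of the kind described above. It is an illustration under assumptions, not the author's file: the key names (addresses, username, password, redial, name, path, subscription_mode, sample_interval) follow the Telegraf gNMI input plugin as commonly documented and should be checked against the README of the installed Telegraf version. The credentials, the sensor path, and the intervals are placeholders.

```python
def render_gnmi_input(host, port, username, password, redial, subscriptions):
    """Return a Telegraf [[inputs.gnmi]] fragment as TOML text.

    `subscriptions` is a list of dicts with the keys "name", "path",
    and "interval" (one dict per openconfig sensor).
    """
    lines = [
        "[[inputs.gnmi]]",
        f'  addresses = ["{host}:{port}"]',
        f'  username = "{username}"',
        f'  password = "{password}"',
        f'  redial = "{redial}"',
    ]
    for sub in subscriptions:
        lines += [
            "",
            "  [[inputs.gnmi.subscription]]",
            f'    name = "{sub["name"]}"',
            f'    path = "{sub["path"]}"',
            '    subscription_mode = "sample"',
            f'    sample_interval = "{sub["interval"]}"',
        ]
    return "\n".join(lines) + "\n"


fragment = render_gnmi_input(
    host="spine1",
    port=50051,
    username="lab-user",        # placeholder, not from the paper
    password="lab-password",    # placeholder, not from the paper
    redial="10s",               # assumed redial interval
    subscriptions=[
        # The path below is illustrative; use the sensor path you need.
        {"name": "cpu", "path": "/components/component/cpu/utilization", "interval": "10s"},
    ],
)
print(fragment)
```

Generating the stanza from a list keeps one inputs.gnmi.subscription block per sensor, which matches the advice to repeat the subscription stanza for every openconfig sensor.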
Native Sensor Plugin
This is a Juniper telemetry interface plugin used for native sensors. In the same telegraf.conf file, add the native sensor plugin inputs.jti_openconfig_telemetry, whose fields are almost the same as those for openconfig. Use a unique client ID for every sensor; here, we use "telegraf3". The unique name used for this sensor is "mem" (Figure 7).
Lastly, add the output plugin outputs.influxdb to send this sensor data to InfluxDB. Here, the database is named "telegraf", with the username "influx" and the password "influxdb" (Figure 8).
Once you have edited the telegraf.conf file, restart the telegraf service. Then check in the InfluxDB CLI that measurements have been created for all the unique sensors. Type "influx" to enter the InfluxDB CLI.
As seen in Figure 9, enter the InfluxDB prompt and use the database "telegraf". All the unique names given to the sensors are listed as measurements.
To see the output of any one measurement, and so confirm that the telegraf file is correct and the sensor is working, use the command "select * from cpu limit 1" as shown in Figure 10.
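The same check can be made without the interactive CLI. The sketch below is an assumption-laden illustration: it builds a request URL for the InfluxDB 1.x HTTP /query endpoint, which takes the database in the db parameter and the InfluxQL statement in q. The server address 192.0.2.10 is a documentation placeholder, and the helper name build_query_url is hypothetical.

```python
from urllib.parse import urlencode


def build_query_url(host: str, database: str, query: str, port: int = 8086) -> str:
    """Build a URL for the InfluxDB 1.x /query endpoint."""
    params = urlencode({"db": database, "q": query})
    return f"http://{host}:{port}/query?{params}"


# Same statement as the CLI check above, against the "telegraf" database.
url = build_query_url("192.0.2.10", "telegraf", "SELECT * FROM cpu LIMIT 1")
print(url)
```

Fetching this URL (for example with urllib.request.urlopen) should return the result as JSON. If authentication is enabled on InfluxDB, credentials must be supplied as well.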
Every time you change the telegraf.conf file, make sure to stop InfluxDB, restart Telegraf, and then start InfluxDB.
After ensuring that the data is being collected correctly, log in to Grafana from the browser and create dashboards.
Go to Connections > InfluxDB > Add new data source.
- Give the data source a name. In this demo it is "test-1".
- Under the HTTP stanza, use the Ubuntu server IP and port 8086.
- In the InfluxDB details, use the same database name, "telegraf", and provide the username and password of the Ubuntu server.
- Click Save & test. Make sure you see the message "successful".
- Once the data source is added successfully, go to Dashboards and click New. Let us create a few dashboards that are essential for AI/ML workloads, working in editor mode.
Examples of Sensor Graphs
The following are examples of some major counters that are essential for monitoring an AI/ML network.
Percentage utilization for an ingress interface et-0/0/0 on spine-1
- Select the data source as test-1.
- In the FROM section, select the measurement as "interface". This is the unique name used for this sensor path.
- In the WHERE section, select device::tag, and in the tag value, select the hostname of the switch, that is, spine1.
- In the SELECT section, choose the sensor branch that you want to monitor; in this case choose "field(/interfaces/interface[if_name='et-0/0/0']/state/counters/if_in_1s_octets)". Then, in the same section, click "+" and add the math expression (/50000000000 * 100). This calculates the percentage utilization of a 400G interface: 400 Gbit/s corresponds to 50,000,000,000 bytes per second.
- Make sure the FORMAT is "time-series", and name the graph in the ALIAS section.
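The math expression used in the SELECT step can be checked with a few lines of Python. This is a sketch that assumes if_in_1s_octets reports received octets (bytes) per second, as its name suggests; the function and constant names are illustrative.

```python
LINK_SPEED_BPS = 400_000_000_000  # 400G interface, as in the example above


def utilization_percent(octets_per_second: float,
                        link_speed_bps: float = LINK_SPEED_BPS) -> float:
    """Convert an octets-per-second reading into percent link utilization."""
    capacity_bytes_per_second = link_speed_bps / 8  # 50,000,000,000 for 400G
    return octets_per_second / capacity_bytes_per_second * 100


print(utilization_percent(12_500_000_000))  # a quarter of line rate
```

For an interface of a different speed, only the divisor changes. A 100G port, for instance, would use 12,500,000,000 in place of 50,000,000,000 in the Grafana math field.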
Peak buffer occupancy for any queue
- Select the data source as test-1.
- In the FROM section, select the measurement as "buffer".
- In the WHERE section, there are three fields to fill. Select device::tag, and in the tag value select the hostname of the switch (i.e. spine-1); AND select /cos/interfaces/interface/@name::tag and choose the interface (i.e. et-0/0/0); AND select the queue as well, /cos/interfaces/interface/queues/queue/@queue::tag, and choose queue number 4.
- In the SELECT section, choose the sensor branch that you want to monitor; in this case, choose "field(/cos/interfaces/interface/queues/queue/PeakBufferOccupancy)".
- Make sure the FORMAT is "time-series", and name the graph in the ALIAS section.
You can collate data for multiple interfaces on the same graph, as seen in Figure 17 for et-0/0/0, et-0/0/1, et-0/0/2, and so on.
PFC and ECN mean derivative
To find the mean derivative (the change in value over a time range), use the raw query mode.
These are the queries we used to find the mean derivative, per second, of the PFC counter on et-0/0/0 and of the ECN counter on et-0/0/8 of Spine-1.
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/pfc-counter/tx_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/8']/state/error-counters/ecn_ce_marked_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
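To make the derivative(mean(...), 1s) step concrete, the following Python sketch reproduces the arithmetic on a short list of per-interval means. It is an approximation of what the InfluxQL function computes, written for illustration only; the sample values are made up, and the input is assumed to be sorted by time with distinct timestamps.

```python
def derivative_per_unit(points, unit_seconds=1.0):
    """Rate of change between consecutive (timestamp_seconds, value) points.

    Each result is stamped with the later timestamp of the pair and is
    scaled to `unit_seconds` (1.0 corresponds to the "1s" in the query).
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        rates.append((t1, (v1 - v0) / (t1 - t0) * unit_seconds))
    return rates


# Hypothetical interval means of a PFC packet counter, 10 seconds apart.
samples = [(0, 100), (10, 150), (20, 150)]
print(derivative_per_unit(samples))
```

With these samples the counter grows by 50 packets over the first 10 seconds (5 packets per second) and not at all over the next 10 seconds, which is the shape the Grafana panel would plot.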
Input resource errors mean derivative
The raw query for the resource errors mean derivative is:
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/error-counters/if_in_resource_errors"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
Tail drops mean derivative
The raw query for the tail drops mean derivative is:
SELECT derivative(mean("/cos/interfaces/interface/queues/queue/tailDropBytes"), 1s) FROM "bu" WHERE ("device"::tag = 'Leaf-1' AND "/cos/interfaces/interface/@name"::tag = 'et-0/0/0' AND "/cos/interfaces/interface/queues/queue/@queue"::tag = '4') AND $timeFilter GROUP BY time($__interval) fill(null)
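The raw queries in this section share one shape: a derivative of a mean, a set of tag filters, and Grafana's time macros. The Python sketch below builds that shape from its parts. It is only a convenience for composing the text pasted into Grafana's raw query editor; the helper name is hypothetical and the inputs are assumed to be trusted constants, since no escaping is performed.

```python
TAG_TEMPLATE = "\"{key}\"::tag = '{value}'"


def build_derivative_query(field, measurement, tags, func="derivative"):
    """Compose an InfluxQL raw query of the form used in this section.

    `tags` maps tag keys to the exact values to match; the entries are
    joined with AND in insertion order.
    """
    conditions = " AND ".join(
        TAG_TEMPLATE.format(key=key, value=value) for key, value in tags.items()
    )
    return (
        f'SELECT {func}(mean("{field}"), 1s) FROM "{measurement}" '
        f"WHERE ({conditions}) AND $timeFilter "
        "GROUP BY time($__interval) fill(null)"
    )


query = build_derivative_query(
    field="/cos/interfaces/interface/queues/queue/tailDropBytes",
    measurement="bu",
    tags={
        "device": "Leaf-1",
        "/cos/interfaces/interface/@name": "et-0/0/0",
        "/cos/interfaces/interface/queues/queue/@queue": "4",
    },
)
print(query)
```

Changing the tags dictionary (for example the interface name or queue number) yields the corresponding query for another panel without retyping the long sensor paths.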
CPU utilization
- Select the data source as test-1.
- In the FROM section, select the measurement as "newcpu".
- In the WHERE section, there are three fields to fill. Select device::tag, and in the tag value select the hostname of the switch (i.e. spine-1); AND in /components/component/properties/property/name::tag, select cpu-utilization-total; AND in name::tag, select RE0.
- In the SELECT section, choose the sensor branch that you want to monitor. In this case, choose "field(state/value)".
The raw query for finding the non-negative derivative of tail drops on multiple switches across multiple interfaces, in bits/sec (the *8 converts bytes to bits):
SELECT non_negative_derivative(mean("/cos/interfaces/interface/queues/queue/tailDropBytes"), 1s)*8 FROM "bu" WHERE ("device"::tag =~ /^Spine-[1-2]$/) AND ("/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/[0-9]/ OR "/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/1[0-5]/) AND $timeFilter GROUP BY time($__interval), "device"::tag fill(null)
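The interface filter in the query above relies on two regular expressions. The Python sketch below tests which interface names they select, assuming that InfluxQL's =~ operator matches anywhere in the tag value in the same way as Python's re.search. The forward slashes need no escaping in Python; the backslashes in the query exist only because InfluxQL delimits a regex with slashes.

```python
import re

# The two patterns from the query, written without the InfluxQL delimiters.
SINGLE_DIGIT = re.compile(r"et-0/0/[0-9]")
TEN_TO_FIFTEEN = re.compile(r"et-0/0/1[0-5]")

# An anchored alternative (not from the paper) that matches only ports 0-15.
ANCHORED = re.compile(r"^et-0/0/([0-9]|1[0-5])$")


def interface_selected(name: str) -> bool:
    """Return True if either pattern from the query matches `name`."""
    return bool(SINGLE_DIGIT.search(name) or TEN_TO_FIFTEEN.search(name))


for name in ["et-0/0/3", "et-0/0/12", "et-0/0/25", "xe-0/0/3"]:
    print(name, interface_selected(name), bool(ANCHORED.search(name)))
```

Under that assumption, the unanchored first pattern also matches names such as et-0/0/25, because "et-0/0/2" is a substring of it. If only ports 0 through 15 are intended, anchoring the pattern as in ANCHORED is one way to restrict the selection.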
These are some examples of the graphs that can be created to monitor an AI/ML network.
Summary
This paper demonstrates how to pull telemetry data and visualize it by creating graphs. It focuses on AI/ML sensors, both native and openconfig, but the setup can be used for all kinds of sensors. We have also included solutions for several issues that you might face while creating the setup. The steps and outputs shown in this paper are specific to the versions of the TIG stack mentioned earlier. They are subject to change depending on the software version, the sensors, and the Junos version.
References
Juniper Yang Data Model Explorer for all sensor options
https://apps.juniper.net/ydm-explorer/
Openconfig forum for openconfig sensors
https://www.openconfig.net/projects/models/
Copyright 2023 Juniper Networks. All rights reserved. Juniper Networks, the Juniper Networks logo, Juniper, Junos, and other trademarks are registered trademarks of Juniper Networks, Inc. and/or its affiliates in the United States and other countries. Other names may be trademarks of their respective owners. Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.
Send feedback to: design-center-comments@juniper.net V1.0/240807/ejm5-telemetry-junos-ai-ml