Telemetry in Junos for AI/ML Workloads
Author: Shalini Mukherjee
Introduction
Because AI cluster traffic requires lossless networks with high throughput and low latency, a critical element of an AI network is the collection of monitoring data. Junos Telemetry enables granular monitoring of key performance indicators, including thresholds and counters for congestion management and traffic load balancing. gRPC sessions support the streaming of telemetry data. gRPC is a modern, open-source, high-performance framework built on HTTP/2 transport. It provides native bidirectional streaming capabilities and includes flexible custom metadata in request headers.
The first step in telemetry is to know what data is to be collected. We can then analyze this data in various formats. Once we collect the data, it is important to present it in a format that makes it easy to monitor, make decisions, and improve the service being offered. In this paper, we use a telemetry stack consisting of Telegraf, InfluxDB, and Grafana. This telemetry stack collects data using a push model. Traditional pull models are resource intensive, require manual intervention, and can leave information gaps in the data they collect. Push models overcome these limitations by delivering data asynchronously. They enrich the data with user-friendly tags and names.
Once the data is in a more readable format, we store it in a database and use it in an interactive visualization web application for analyzing the network. Figure 1 shows how this stack is designed for efficient data collection, storage, and visualization, from network devices pushing data to the collector to the data being displayed on dashboards for analysis.
TIG Stack
We used an Ubuntu server to install all the software, including the TIG stack.
Telegraf
To collect data, we use Telegraf on an Ubuntu server running 22.04.2. The Telegraf version running in this demo is 1.28.5.
Telegraf is a plugin-driven server agent for collecting and reporting metrics. It uses processor plugins to enrich and normalize the data. Output plugins send this data to various data stores. In this document we use two plugins: one for openconfig sensors and the other for Juniper native sensors.
InfluxDB
To store the data in a time series database, we use InfluxDB. The output plugin in Telegraf sends the data to InfluxDB, which stores it in a highly efficient manner. We are using V1.8 because there is no CLI for V2 and above.
Grafana
Grafana is used to visualize this data. Grafana pulls the data from InfluxDB and lets users create rich, interactive dashboards. Here, we are running version 10.2.2.
Configuration on the Switch
To implement this stack, we first need to configure the switch as shown in Figure 2. We have used port 50051; any port can be used here. Log in to the QFX switch and add the following configuration.
Note: This configuration is intended for labs/POCs because the password is transmitted in clear text. Use SSL to avoid this.
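Before moving on to the collector, it can help to confirm that the gRPC port configured on the switch is reachable from the Ubuntu server. The following is a minimal Python sketch and is not part of the original setup. The function name and the timeout value are assumptions made for illustration, and the check covers TCP reachability only, not gRPC authentication.

```python
import socket


def grpc_port_reachable(host: str, port: int = 50051, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call, using the demo hostname and port from this paper:
#   grpc_port_reachable("spine1", 50051)
```

A False result usually points to a firewall rule, a wrong port, or the gRPC service not being enabled on the switch.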
Environment
Nginx
This is needed if you are unable to expose the port on which Grafana is hosted. The next step is to install nginx on the Ubuntu server to serve as a reverse proxy. Once nginx is installed, add the lines shown in Figure 4 to the "default" file and move the file from /etc/nginx to /etc/nginx/sites-enabled.
Make sure that the firewall is adjusted to give full access to the nginx service, as shown in Figure 5.
Once nginx is installed and the required changes are made, we should be able to access Grafana from a web browser by using the IP address of the Ubuntu server where all the software is installed.
There is a small glitch in Grafana that can prevent you from resetting the password. Use the following steps if you run into this issue.
Steps to be performed on the Ubuntu server to set the password in Grafana:
- Go to /var/lib/grafana, where grafana.db is stored.
- Install sqlite3.
o sudo apt install sqlite3
- Run this command on your terminal.
o sqlite3 grafana.db
- The SQLite command prompt opens; run the following query:
> delete from user where login='admin'
- Restart grafana and type admin as both the username and the password. It prompts for a new password.
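The same reset can be scripted. This is a minimal sketch using Python's standard sqlite3 module, assuming the database sits at /var/lib/grafana/grafana.db as described above. The function name is hypothetical, and stopping Grafana before modifying the file is a precaution assumed here rather than a step from the procedure.

```python
import sqlite3


def delete_admin_user(db_path: str) -> int:
    """Delete the 'admin' row from Grafana's user table; return rows removed."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute("DELETE FROM user WHERE login = ?", ("admin",))
            return cur.rowcount
    finally:
        conn.close()


# Example call (path taken from the steps above; run with sufficient privileges):
#   delete_admin_user("/var/lib/grafana/grafana.db")
```

After the row is removed, restart Grafana and log in with admin/admin as in the last step of the list.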
Once all the software is installed, create the config file in Telegraf that pulls the telemetry data from the switch and pushes it to InfluxDB.
Openconfig Sensor Plugin
On the Ubuntu server, edit the /etc/telegraf/telegraf.conf file to add all the required plugins and sensors. For openconfig sensors, we use the gNMI plugin shown in Figure 6. For demo purposes, add the hostname as "spine1", the port number "50051" that is used for gRPC, the username and password of the switch, and the number of seconds to wait before redialing in case of failure.
In the subscription stanza, add a unique name ("cpu" for this particular sensor), the sensor path, and the time interval for picking up this data from the switch. Add the same inputs.gnmi and inputs.gnmi.subscription plugin stanzas for all the openconfig sensors (Figure 6).
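As an illustration of the stanza described above, the following Python sketch renders an inputs.gnmi block as text. The option names (addresses, username, password, redial, and the subscription's name, path, subscription_mode, and sample_interval) are those of Telegraf's gNMI input plugin. The helper name and every value other than "spine1", "50051", and "cpu" are placeholders, not the contents of Figure 6.

```python
def render_gnmi_input(host, port, username, password, redial,
                      name, path, interval):
    """Render an [[inputs.gnmi]] stanza with one subscription as TOML text.

    Values are inserted verbatim, so they must not contain double quotes.
    """
    return "\n".join([
        "[[inputs.gnmi]]",
        f'  addresses = ["{host}:{port}"]',
        f'  username = "{username}"',
        f'  password = "{password}"',
        f'  redial = "{redial}"',  # wait time before reconnecting on failure
        "",
        "  [[inputs.gnmi.subscription]]",
        f'    name = "{name}"',    # becomes the measurement name in InfluxDB
        f'    path = "{path}"',
        '    subscription_mode = "sample"',
        f'    sample_interval = "{interval}"',
    ])


# Placeholder credentials, sensor path, and intervals; only the host, port,
# and subscription name come from the text.
stanza = render_gnmi_input("spine1", 50051, "<user>", "<password>", "10s",
                           "cpu", "<openconfig-sensor-path>", "10s")
print(stanza)
```

Generating one stanza per sensor this way keeps the unique names consistent with the measurements that later appear in InfluxDB.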
Native Sensor Plugin
This is a Juniper telemetry interface plugin used for native sensors. In the same telegraf.conf file, add the native sensor plugin inputs.jti_openconfig_telemetry, whose fields are almost the same as those of the openconfig plugin. Use a unique client ID for every sensor; here, we use "telegraf3". The unique name used here for this sensor is "mem" (Figure 7).
Lastly, add an output plugin, outputs.influxdb, to send this sensor data to InfluxDB. Here, the database is named "telegraf", with the username "influx" and the password "influxdb" (Figure 8).
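The two stanzas described above can be sketched in the same way. The option names (servers, username, password, client_id, sample_frequency, sensors for the input; urls, database, username, password for the output) belong to Telegraf's jti_openconfig_telemetry and influxdb plugins. The helper names, the switch credentials, the sensor path, the sampling frequency, and the InfluxDB URL are assumptions for illustration and do not reproduce Figures 7 and 8.

```python
def render_jti_input(host, port, username, password, client_id,
                     sensor_name, sensor_path, frequency):
    """Render an [[inputs.jti_openconfig_telemetry]] stanza as TOML text."""
    return "\n".join([
        "[[inputs.jti_openconfig_telemetry]]",
        f'  servers = ["{host}:{port}"]',
        f'  username = "{username}"',
        f'  password = "{password}"',
        f'  client_id = "{client_id}"',
        f'  sample_frequency = "{frequency}"',
        # An identifier placed before the path is used as the measurement name.
        f'  sensors = ["{sensor_name} {sensor_path}"]',
    ])


def render_influxdb_output(url, database, username, password):
    """Render an [[outputs.influxdb]] stanza as TOML text."""
    return "\n".join([
        "[[outputs.influxdb]]",
        f'  urls = ["{url}"]',
        f'  database = "{database}"',
        f'  username = "{username}"',
        f'  password = "{password}"',
    ])


# "telegraf3", "mem", "telegraf", "influx", and "influxdb" come from the text;
# everything in angle brackets, the frequency, and the URL are placeholders.
jti = render_jti_input("spine1", 50051, "<user>", "<password>", "telegraf3",
                       "mem", "<native-sensor-path>", "10000ms")
out = render_influxdb_output("http://127.0.0.1:8086", "telegraf",
                             "influx", "influxdb")
print(jti + "\n\n" + out)
```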
Once you edit the telegraf.conf file, restart the telegraf service. Then check in the InfluxDB CLI that measurements have been created for all the unique sensors. Type "influx" to enter the InfluxDB CLI.
As seen in Figure 9, enter the InfluxDB prompt and use the database "telegraf". All the unique names given to the sensors are listed as measurements.
To see the output of any one measurement, simply to confirm that the telegraf file is correct and the sensor is working, use the command "select * from cpu limit 1" as shown in Figure 10.
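The same check can be made without the CLI, since InfluxDB 1.x also answers InfluxQL over HTTP on its /query endpoint. The sketch below is an assumption-laden alternative to the CLI step above: the function names are hypothetical, the host is a placeholder, and if authentication is enabled on your InfluxDB instance you would need to add credentials, which this sketch omits.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(host, query, db="telegraf", port=8086):
    """Build the InfluxDB 1.x /query URL for an InfluxQL statement."""
    params = urlencode({"db": db, "q": query})
    return f"http://{host}:{port}/query?{params}"


def run_query(host, query, db="telegraf", port=8086):
    """Send the query and return the decoded JSON response."""
    with urlopen(build_query_url(host, query, db, port), timeout=5) as resp:
        return json.load(resp)


# Example call against the collector itself (placeholder address):
#   run_query("127.0.0.1", "SELECT * FROM cpu LIMIT 1")
```

An empty result for a measurement suggests that the corresponding sensor stanza in telegraf.conf is not producing data.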
Every time the telegraf.conf file is changed, make sure to stop InfluxDB, restart Telegraf, and then start InfluxDB.
Log in to Grafana from the browser and create dashboards after ensuring that the data is being collected correctly.
Go to Connections > InfluxDB > Add new data source.
- Give this data source a name. In this demo it is "test-1".
- Under the HTTP stanza, use the Ubuntu server IP and port 8086.
- In the InfluxDB details, use the same database name, "telegraf", and provide the username and password of the Ubuntu server.
- Click Save & test. Make sure you see the message "successful".
- Once the data source is added successfully, go to Dashboards and click New. Let us create a few dashboards that are essential for AI/ML workloads in editor mode.
Examples of Sensor Graphs
The following are examples of some common counters that are essential for monitoring an AI/ML network.
Percentage utilization for an ingress interface et-0/0/0 on spine-1
- Select the data source as test-1.
- In the FROM section, select the measurement as "interface". This is the unique name used for this sensor path.
- In the WHERE section, select device::tag, and in the tag value, select the hostname of the switch, that is, spine1.
- In the SELECT section, select the sensor branch that you want to monitor; in this case, select "field(/interfaces/interface[if_name='et-0/0/0']/state/counters/if_in_1s_octets)". In the same section, click "+" and add this math: (/50000000000 * 100). This calculates the percentage utilization for a 400G interface, which carries at most 50,000,000,000 octets per second.
- Make sure the FORMAT is "Time series", and name the graph in the ALIAS section.
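The math term in the SELECT step follows from the link speed: 400 Gb/s divided by 8 bits per octet gives 50,000,000,000 octets per second, so dividing the one-second octet counter by that figure and multiplying by 100 yields a percentage. A small Python sketch of the same arithmetic follows; the function name and the link-speed parameter are illustrative assumptions, which also shows how the divisor would change for other interface speeds.

```python
def utilization_percent(octets_per_second, link_speed_bps=400_000_000_000):
    """Convert an octets-per-second counter into percent of link capacity."""
    capacity_octets_per_second = link_speed_bps / 8  # 8 bits per octet
    return octets_per_second / capacity_octets_per_second * 100


# A 400G interface receiving 12.5 billion octets per second is 25% utilized.
print(utilization_percent(12_500_000_000))  # 25.0
```

For a 100G interface, for instance, passing link_speed_bps=100_000_000_000 corresponds to a divisor of 12,500,000,000 in the Grafana math field.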
Peak buffer occupancy for any queue
- Select the data source as test-1.
- In the FROM section, select the measurement as "buffer".
- In the WHERE section, there are three fields to fill in. Select device::tag, and in the tag value select the hostname of the switch (i.e. spine-1); AND select /cos/interfaces/interface/@name::tag and select the interface (i.e. et-0/0/0); AND select the queue as well, /cos/interfaces/interface/queues/queue/@queue::tag, and select queue number 4.
- In the SELECT section, select the sensor branch that you want to monitor; in this case, select "field(/cos/interfaces/interface/queues/queue/PeakBufferOccupancy)".
- Make sure the FORMAT is "Time series", and name the graph in the ALIAS section.
You can collate data for multiple interfaces on the same graph, as seen in Figure 17 for et-0/0/0, et-0/0/1, et-0/0/2, and so on.
PFC and ECN mean derivative
To find the mean derivative (the difference in value within a time range), use the raw query mode.
These are the input queries that we used to find the mean derivative, per second, of the PFC counter on et-0/0/0 and of the ECN counter on et-0/0/8 of Spine-1.
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/pfc-counter/tx_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/8']/state/error-counters/ecn_ce_marked_pkts"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
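To make clear what derivative(mean(...), 1s) with GROUP BY time(...) produces, the following Python sketch reproduces the two stages on a list of (timestamp, value) samples: first the mean of each time bucket, then the rate of change between consecutive bucket means, normalized to one second. It is a simplified model for illustration, not InfluxDB's implementation, and the function names and sample values are hypothetical.

```python
from collections import defaultdict


def bucket_means(samples, interval_s):
    """Group (time_s, value) samples into fixed buckets and average each."""
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[t - t % interval_s].append(v)  # key is the bucket start time
    return [(start, sum(vs) / len(vs)) for start, vs in sorted(buckets.items())]


def derivative(points, unit_s=1.0):
    """Rate of change between consecutive points, per unit_s seconds."""
    return [
        (t1, (v1 - v0) / ((t1 - t0) / unit_s))
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    ]


# Hypothetical counter samples taken every 5 seconds, grouped in 10 s buckets.
samples = [(0, 100), (5, 200), (10, 400), (15, 600)]
means = bucket_means(samples, 10)   # [(0, 150.0), (10, 500.0)]
print(derivative(means))            # [(10, 35.0)]
```

In this example the counter's bucket mean rises by 350 over 10 seconds, which the query would plot as 35 packets per second.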
Input resource errors
The raw query for the resource errors mean derivative is:
SELECT derivative(mean("/interfaces/interface[if_name='et-0/0/0']/state/error-counters/if_in_resource_errors"), 1s) FROM "interface" WHERE ("device"::tag = 'Spine-1') AND $timeFilter GROUP BY time($interval)
Tail drops mean derivative
The raw query for the tail drops mean derivative is:
SELECT derivative(mean("/cos/interfaces/interface/queues/queue/tailDropBytes"), 1s) FROM "buffer" WHERE ("device"::tag = 'Leaf-1' AND "/cos/interfaces/interface/@name"::tag = 'et-0/0/0' AND "/cos/interfaces/interface/queues/queue/@queue"::tag = '4') AND $timeFilter GROUP BY time($__interval) fill(null)
CPU utilization
- Select the data source as test-1.
- In the FROM section, select the measurement as "newcpu".
- In the WHERE section, there are three fields to fill in. Select device::tag, and in the tag value select the hostname of the switch (i.e. spine-1); AND in /component/component/properties/property/name::tag, select cpu-utilization-total; AND in name::tag, select RE0.
- In the SELECT section, select the sensor branch that you want to monitor. In this case, select "field(state/value)".
The following raw query finds the non-negative derivative of tail drops for multiple switches on multiple interfaces, in bits/sec.
SELECT non_negative_derivative(mean("/cos/interfaces/interface/queues/queue/tailDropBytes"), 1s)*8 FROM "buffer" WHERE (device::tag =~ /^Spine-[1-2]$/) AND ("/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/[0-9]/ OR "/cos/interfaces/interface/@name"::tag =~ /et-0\/0\/1[0-5]/) AND $timeFilter GROUP BY time($__interval), device::tag fill(null)
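The query above combines two ideas: regular-expression filters that pick the switches and interfaces, and a non-negative derivative multiplied by 8 to turn bytes per second into bits per second. The Python sketch below models both under simplifying assumptions. It uses Python's re module, whereas InfluxQL uses Go regular expressions; for these simple patterns the matching behaviour is expected to be the same. The function names and sample points are hypothetical.

```python
import re

DEVICE_RE = re.compile(r"^Spine-[1-2]$")
# The two interface alternatives from the query, joined with "|".
IFACE_RE = re.compile(r"et-0/0/[0-9]|et-0/0/1[0-5]")


def selected(device, interface):
    """True if the device and interface tags pass both regex filters."""
    return bool(DEVICE_RE.search(device)) and bool(IFACE_RE.search(interface))


def non_negative_bits_per_second(points):
    """Per-second rate between consecutive (time_s, bytes) points, in bits.

    Negative rates (for example after a counter reset) are dropped.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        rate = (v1 - v0) / (t1 - t0)
        if rate >= 0:
            rates.append((t1, rate * 8))  # 8 bits per byte
    return rates


print(selected("Spine-1", "et-0/0/12"))   # True
print(selected("Spine-3", "et-0/0/1"))    # False
# Hypothetical tailDropBytes means; the drop at t=20 models a counter reset.
print(non_negative_bits_per_second([(0, 0), (10, 1000), (20, 500), (30, 1500)]))
# [(10, 800.0), (30, 800.0)]
```

One design point worth checking in your own dashboards: the interface patterns are not anchored, so in this sketch selected("Spine-2", "et-0/0/16") is also True because "et-0/0/1" matches the first alternative. If the intent is to restrict the graph to et-0/0/0 through et-0/0/15, anchoring the patterns with ^ and $ would be one way to do so.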
These are some examples of the graphs that can be created for monitoring an AI/ML network.
Summary
This paper demonstrates how to pull telemetry data and visualize it by creating graphs. It focuses on AI/ML sensors, both native and openconfig, but the setup can be used for all kinds of sensors. We have also included solutions for several issues that you might face while creating the setup. The steps and outputs presented in this paper are specific to the versions of the TIG stack mentioned earlier. They are subject to change depending on the version of the software, the sensors, and the Junos version.
References
Juniper Yang Data Model Explorer for all sensor options
https://apps.juniper.net/ydm-explorer/
Openconfig forum for openconfig sensors
https://www.openconfig.net/projects/models/
Copyright 2023 Juniper Networks, Inc. All rights reserved. Juniper Networks, the Juniper Networks logo, Juniper, Junos, and other trademarks are registered trademarks of Juniper Networks, Inc. and/or its affiliates in the United States and other countries. Other names may be trademarks of their respective owners. Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.
Send feedback to: design-center-comments@juniper.net V1.0/240807/ejm5-telemetry-junos-ai-ml