#mlops — Tagged Stories | WeSearch Press

All 19 stories in the WeSearch catalog tagged #mlops, listed by publish time (newest first) with view counts. Tag pages update as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

RELATED TAGS

#ai (7) #devops (6) #machinelearning (5) #llm (4) #infrastructure (4) #monitoring (3) #kubernetes (2) #platformengineering (1) #governance (1) #todd-linnertz (1) #ai-dev-26 (1) #payroll-team (1)

DEV.TO (TOP)

LiteLLM vs OpenRouter: I Used Both. Here's Where Each One Actually Broke.

LiteLLM vs OpenRouter isn't a close call, they're solving different problems. I ran both in production before understanding that. Here's the honest breakdown of what each does well…

35 views · Fri, 26 Jun 2026 06:30:00 GMT

#ai #devops

DEV.TO (TOP)

Handling Failure: The Most Important Part of AI Systems

Every AI system will fail. The question isn't whether it will happen. The question is: What…

35 views · Fri, 29 May 2026 15:08:10 GMT

#ai #machinelearning

DEV.TO (TOP)

Virtual keys per tenant: ditching our custom LLM billing layer

TL;DR: We had 11,247 lines of Python middleware handling per-tenant LLM cost attribution, rate…

33 views · Wed, 27 May 2026 16:02:19 GMT

#llm #infrastructure

DEV.TO (TOP)

AI Observability: Stop Flying Blind in Production

You shipped your AI feature three months ago. Users love it. Usage is growing. But when someone asks…

36 views · Wed, 27 May 2026 11:21:37 GMT

#ai #monitoring

DEV.TO (TOP)

LLM-as-judge variance broke our DPO training signal for 3 weeks

TL;DR: Our DPO pipeline used a single LLM as the preference judge. Training reward climbed every run. …

30 views · Wed, 27 May 2026 06:31:57 GMT

#machinelearning #llm

DEV.TO (TOP)

Capping VLM spend per CV researcher: hierarchical budgets in practice

TL;DR: Our 11-person CV team at Prophesee was burning through €3-4k weeks of VLM spend on dataset…

34 views · Tue, 26 May 2026 16:52:17 GMT

#machinelearning #computervision

DEV.TO (TOP)

Token-level eval harness for tool-calling agents: what we wired up

TL;DR: We replaced our "did the agent finish the task" pass/fail eval with a token-level harness that…

35 views · Tue, 26 May 2026 16:03:35 GMT

#machinelearning #devops

DEV.TO (TOP)

Prefix caching in vLLM under multi-tenant agent traffic

TL;DR: We turned on vLLM's prefix cache for our agent workloads at Nexus Labs and watched TTFT drop…

31 views · Tue, 26 May 2026 06:35:20 GMT

#infrastructure #pytorch

DEV.TO (TOP)

How to Detect GPU Waste in a Kubernetes Cluster

GPU waste in Kubernetes does not announce itself. Your cluster shows healthy utilization. Your…

30 views · Mon, 25 May 2026 19:27:09 GMT

#kubernetes #gpu

DEV.TO (TOP)

Why 91% of AI Agents Fail in Production (And What the 9% Do Differently)

Everyone is building AI agents right now. Autonomous systems that reason, plan, and act without…

27 views · Sat, 23 May 2026 14:29:16 GMT

#ai #production

DEV.TO (TOP)

llm-nano-vm v0.8.0 — deterministic FSM runtime for LLM pipelines, now with output validation and per-step timeouts

PyPI: pip install llm-nano-vm GitHub: http://github.com/Ale007XD/nano_vm MCP gateway: …

35 views · Sat, 23 May 2026 04:36:37 GMT

#backend #opensource

DEV.TO (TOP)

I Revived a Broken MLOps Platform — Now It's Self-Service, Policy-Guarded, and Operationally Credible

I abandoned this Kubernetes platform on April 4th. 48 days later I rebuilt it: CrashLoopBackOff everywhere → self-service GitOps, policy enforcement, and deterministic recovery. 21…

29 views · Sat, 23 May 2026 00:42:39 GMT

#ai #devops

DEV.TO (TOP)

Inference Routing Is Becoming an Infrastructure Placement Problem

The request arrives. The model answers. For most teams, everything in between is invisible — a…

29 views · Thu, 21 May 2026 12:14:17 GMT

#infrastructure #cloudarchitecture

DEV.TO (TOP)

Detecting Silent Model Failure: Drift Monitoring That Actually Works

TL;DR: Most drift monitoring setups alert on the wrong thing. Feature distribution drift is cheap to…

28 views · Wed, 20 May 2026 06:55:39 GMT

#machinelearning #infrastructure

DEV.TO (TOP)

When AI Meets Reality: Why “Hello World” Isn’t Enough for LLM Systems

Most AI tutorials stop at “Hello World.” You wire up a model, send a prompt, get a response, and feel…

29 views · Tue, 19 May 2026 05:11:16 GMT

#ai #llm

DEV.TO (TOP)

KubeCon Amsterdam 2026: The Industrialization of ML - A Deep Dive into Uber’s AI Platform Architecture.

This article serves as a technical follow-up to our KubeCon 2026 coverage, providing a comprehensive…

33 views · Sun, 17 May 2026 08:23:47 GMT

#machine learning #kubernetes #ai platform

DEV.TO (TOP)

The Agent Is 20% of the Work. The Platform Is the Other 80%.

A payroll agent hit 94% accuracy in testing and dropped to 70% in production. What closed the gap had nothing to do with the model. Here's what that means for every enterprise team…

37 views · Sun, 17 May 2026 04:56:38 GMT

#ai #platformengineering #devops

R/LEARNPROGRAMMING

MLOPS AND LLMOPS BUDDY

Need Mlops advise
