Secure Boot certs expiring, etcd patching websocket auth, EKS gets rollback, and Linux 7.3 targets NVMe bottlenecks — solid infrastructure week.
// SECURITY FOCUS
Secure Boot certificate expiration: what actually breaks and when
The Microsoft UEFI CA cert that signed most Linux bootloaders expired recently — systems already booted won’t stop, but any reinstall, recovery USB, or PXE scenario hitting unupdated shim or grub binaries will fail Secure Boot validation. The window to keep using existing signed binaries is shorter than the expiration date implies because distros need time to push re-signed replacements through their chains. Fleet ops running automated reinstalls or bare-metal provisioning should verify their boot images are signed against the new cert before a recovery scenario forces the issue.
What to do: Audit your PXE/reinstall images and recovery media now; confirm your distro has shipped re-signed shim/grub binaries and update before the next provisioning run.
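One way to start that audit is a plain hash inventory of the bootloader binaries inside a mounted PXE tree or recovery image. The sketch below is illustrative only. The filename prefixes in `BOOT_PREFIXES` and the `known_good` set are assumptions: you would fill that set with the SHA-256 hashes of the re-signed shim/grub binaries your distro actually ships. It does not verify signatures; it only flags binaries whose hash you have not confirmed.

```python
import hashlib
from pathlib import Path

# Assumed filename prefixes for bootloader binaries; adjust for your distro.
BOOT_PREFIXES = ("shim", "grub", "boot")


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large images do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_boot_tree(root, known_good):
    """Return (relative_path, sha256, status) for each EFI bootloader under root.

    known_good is a set of SHA-256 hex digests you have confirmed as re-signed.
    """
    root = Path(root)
    results = []
    for path in sorted(root.rglob("*")):
        name = path.name.lower()
        if not path.is_file() or not name.endswith(".efi"):
            continue
        if not name.startswith(BOOT_PREFIXES):
            continue
        digest = sha256_of(path)
        status = "ok" if digest in known_good else "needs-review"
        results.append((str(path.relative_to(root)), digest, status))
    return results
```

Run it against each image tree before the next provisioning cycle; anything marked `needs-review` is a binary to check against your distro's current packages.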
- Etcd v3.5.32 and v3.6.13 fix a websocket authentication bug — etcd · Jul 1
SIG-etcd released v3.5.32 and v3.6.13, patching dependency CVEs and fixing a websocket authentication bug where bearer-prefixed tokens caused valid requests to be rejected. Both releases bump to Go 1.25.11 and update OpenTelemetry to v1.43.0 to address CVE-2026-29181 and CVE-2026-39883; v3.6.13 also pulls in golang.org/x/crypto v0.52.0 for additional CVE coverage. A new write-only-skip-check value for --v2-deprecation is the most operationally interesting addition: it lets operators upgrading from v3.5 to v3.6 bypass the startup check that blocks etcd when non-membership v2 data is still present, buying time before write-only-drop-data becomes the default in v3.7. v3.5.32 also backports the non-admin maintenance Status endpoint access from v3.6.12. v3.4 is end-of-life and won’t receive these fixes, so anyone still on it needs to migrate.
- Amazon EKS now lets you roll back a Kubernetes version upgrade within 7 days — AWS News Blog · Jul 1
Amazon EKS now lets you roll back a Kubernetes control plane upgrade within 7 days of the upgrade completing – a capability that open-source Kubernetes doesn’t offer natively. You get one minor version back at a time, matching the incremental upgrade model, and EKS runs pre-rollback cluster insights checks to flag node version mismatches or add-on dependency issues before you proceed. A `--force` flag skips those checks if you need to move fast. For EKS Auto Mode clusters, managed nodes roll back alongside the control plane while respecting pod disruption budgets, and a cancel API lets you abort a node rollback mid-flight if it’s taking too long. Control plane rollback took roughly 20 minutes in the author’s test – comparable to a standard upgrade. Available now in all commercial AWS regions at no added cost; you pay only normal EKS and compute charges. Worth noting: the 7-day window is a hard cutoff, so teams upgrading large cluster fleets need to build that validation period into their rollout schedule.
- Linux 7.3 targets a “significant bottleneck” for small direct I/O on PCIe Gen5 NVMe — Phoronix · Jul 1
ByteDance engineer Fengnan Chang traced a significant bottleneck in 4K random read performance on PCIe Gen5 NVMe SSDs to memory allocations and state-machine overhead in the kernel’s IOmap direct I/O path. The fix introduces a simplified DIO path that bypasses the usual IOmap machinery when the request size is at or below the inode blocksize – covering the common small-I/O case on EXT4 and XFS, provided the inode isn’t encrypted. Testing showed a jump from 1.92M to 2.19M IOPS on Gen5 hardware, with up to 10% gains on EXT4 and XFS under io_uring at higher queue depths. The patch is queued in the VFS tree targeting Linux 7.3 later this year; if your stack is Gen5 NVMe with io_uring and small block workloads, this one’s worth tracking.
- Flux 2.9 GA ships a CLI plugin system — Flux CD · Jun 30
Flux 2.9 GA ships a CLI plugin system (RFC-0013) that lets you install, pin, and version extensions independently of the core binary – two first-party plugins launch with it: Mirror (registry-to-registry sync for Helm charts, OCI artifacts, and images) and Schema (manifest validation against JSON schemas and CEL rules). Server-side apply gets a long-needed field ignore rule via Kustomization.spec.ignore, so Flux stops fighting HPAs and admission webhooks over fields they legitimately own. Other additions include SOPS decryption with the Age post-quantum cipher, Workload Identity auth for HashiCorp Vault and OpenBao (no more long-lived tokens), SSH key support for Git commit signing and verification, and OIDC-secured webhook Receivers. Two breaking changes to check before upgrading: the default Helm post-render strategy flips from nohooks to combined (chart hooks now included in post-rendering), and the v1beta2 image and notification APIs are fully removed – run flux migrate before upgrading or you’ll lose resources. Flux v2.6 is end-of-life.
- Rust 1.96.1 released — Rust Blog · Jun 30
Rust 1.96.1 is out as a patch release fixing three CVEs in libssh2, which is compiled into Cargo – so this affects anyone running Cargo, not just code that explicitly uses libssh2. The article doesn’t detail the specific CVEs or any other fixes beyond that, so check the full release notes before triaging urgency. Update via `rustup update stable`.
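The etcd and Rust items above both reduce to the same fleet question: is this host at or past the patched version on its branch? A minimal sketch of that triage follows, using only the version numbers named in this issue. The function names and the `MIN_PATCHED` table layout are hypothetical, and branches this issue does not mention come back as `unknown` rather than guessed at.

```python
import re

# Minimum patched versions taken from this issue, keyed by (product, minor branch).
MIN_PATCHED = {
    ("etcd", (3, 5)): (3, 5, 32),
    ("etcd", (3, 6)): (3, 6, 13),
    ("rust", (1, 96)): (1, 96, 1),
}

# Branches this issue describes as end-of-life.
END_OF_LIFE = {("etcd", (3, 4))}


def parse_version(text):
    """Turn 'v3.5.32' or '3.5.32' into (3, 5, 32); any suffix is ignored."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def triage(product, version_text):
    """Classify a reported version as patched, update, end-of-life, or unknown."""
    version = parse_version(version_text)
    branch = version[:2]
    if (product, branch) in END_OF_LIFE:
        return "end-of-life"
    minimum = MIN_PATCHED.get((product, branch))
    if minimum is None:
        return "unknown"  # branch not covered by this issue
    return "patched" if version >= minimum else "update"
```

Feed it whatever version strings your inventory already collects, for example `triage("etcd", "v3.5.31")`, and sort the results by status to get an upgrade list.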
// In other news
ai
- Have your agent record video demos of its work with shot-scraper video (Simon Willison) · Jun 30 — shot-scraper 1.10 adds a `video` command so agents can record screen walkthroughs as MP4 – useful for async debugging and demo generation without a separate capture tool.
- LLMs are stuck in a groupthink groove. This startup is trying to get them out. (MIT Technology Review AI) · Jul 1 — LLMs converge on predictable outputs (ask for a random 1-10, get 7) because RLHF optimizes for average approval – a startup is testing diversity-injection at inference time to break the pattern.
- Autoresearch: The feedback loop behind self-improving agents (Latent Space) · Jul 1 — Introspection’s Roland Gavrilescu details autoresearch loops where agents critique and re-run their own experiments, with humans gating only the final recipe selection.
- 🔬 The Coolest Diffusion Research Isn’t in LLMs — Evan Feinberg & Sergey Edunov, Genesis Molecular AI (Latent Space) · Jul 1 — Llama lead Sergey Edunov left Meta for Genesis Molecular AI after PEARL achieved zero-shot wins on OpenBind, arguing co-folding accuracy now unlocks drug design loops that weren’t viable two years ago.
- Featuring Every Eval Ever Results on Hugging Face Model Pages (Hugging Face Blog) · Jun 30 — Hugging Face now surfaces community evaluation results directly on model pages, making it easier to compare third-party benchmarks alongside official ones without leaving the hub.
cloud
- Your site, your rules: new AI traffic options for all customers (Cloudflare Blog) · Jul 1 — Cloudflare now lets all customers split AI traffic into crawler vs. agent buckets and set separate rules for each, replacing the old all-or-nothing bot block.
- Scaling LLM Inference: Multi-Node KV Cache Offloading with GKE & Managed Lustre (Google Cloud Blog) · Jul 1 — GKE with Managed Lustre offloads KV cache across nodes during LLM inference, letting teams scale past single-node GPU memory limits without custom networking.
- Preventing data exfiltration in machine learning environments with Amazon SageMaker AI (AWS Architecture) · Jun 29 — Three-layer SageMaker exfiltration defense – VPC endpoints plus WorkSpaces Secure Browser – blocks data leaving the training environment without killing researcher UX.
- Dual-token authentication for Nakama game servers with Amazon Cognito on AWS (AWS Architecture) · Jun 29 — Dual-token Cognito setup for Nakama validates SRP-authenticated JWTs in a Go runtime hook, letting game clients auth without exposing a client secret.
- Unmasking the crawls with Attribution Business Insights (Cloudflare Blog) · Jul 1 — Cloudflare’s Attribution Business Insights dashboard names individual crawlers and estimates their content appetite, giving site owners data to back crawler-pricing conversations.
culture
- How Kent Beck shapes the software engineering industry (Pragmatic Engineer) · Jul 1 — Kent Beck argues the core TDD loop – fast feedback, small steps, trust-building – stays relevant as AI generates code, because the discipline is about confidence, not keystrokes.
- Why your AI bill is bigger than it should be (LeadDev) · Jul 1 — Oversized prompts, uncached repeated context, and unthrottled retries are the main drivers of inflated LLM bills – token hygiene is now a measurable engineering concern, not a finance footnote.
iac
- HCP Terraform Powered by Infragraph Limited Availability Launch (HashiCorp Blog) · Jun 30 — HCP Terraform’s Infragraph LA builds a live dependency graph across hybrid and multi-cloud estates, targeting drift detection and blast-radius analysis that static state files can’t provide.
- GitLab Patch Release: 18.8.11 (GitLab) · Jul 1 — GitLab 18.8.11 patch is out – self-hosted operators on the 18.8 branch should review the changelog and apply before the next security window.
- Fully Automated AI Inference on AWS, Azure, and Google Cloud with Pulumi (Pulumi Blog) · Jun 30 — Pulumi walkthrough deploys Ollama on GPU instances across AWS, Azure, and GCP with a single program, covering instance type selection, networking, and teardown in one codebase.
k8s
- Understanding dynamic resource allocation in Kubernetes (CNCF Blog) · Jul 1 — DRA reached GA in Kubernetes v1.35 and NVIDIA moved their dra-driver-nvidia-gpu into Kubernetes itself, making GPU resource scheduling via structured parameters the new supported path.
- Support for Istio 1.28 has ended (Istio) · Jul 1 — Istio 1.28 is now fully EOL – no further security backports; any cluster still running 1.28 needs an upgrade path planned now.
linux
- Glibc Introduces /etc/tunables.conf For System-Wide Tunables (Phoronix) · Jul 1 — Glibc gains /etc/tunables.conf so sysadmins can set GLIBC_TUNABLES system-wide without patching env vars into every service unit.
- Asahi Linux Fixes Booting With macOS 27, Progress On M3 & Apple Video Decode (Phoronix) · Jul 1 — Asahi Linux patches boot regression introduced by macOS 27 and lands early Apple Video Decode support for M3, though M3 keyboard input still needs an out-of-tree driver.
- GCC 16.2 Being Planned For Early August Release (Phoronix) · Jul 1 — GCC 16.2 targets early August with backported bug fixes for those holding off on 16.x until a stable point release.
- Mageia 10 released (LWN.net) · Jun 29 — Mageia 10 ships with Linux 6.18, DNF 5.4, and RPM 4.20.1 while raising the x86 32-bit hardware floor – effectively nudging remaining 32-bit installs toward EOL.
- ASUS ROG Strix Laptop Sees Driver Fix For Linux Performance Too Low Compared To Windows (Phoronix) · Jul 1 — ASUS ROG Strix G16 Linux driver fix resolves platform/WMI misconfiguration that capped GPU performance well below Windows levels on the same hardware.
obs
- Datadog acquires Adaptive ML (Datadog Blog) · Jun 30 — Datadog acquired Adaptive ML, whose platform trains and deploys specialized AI agents – likely the foundation for Datadog’s next-gen anomaly and root-cause models inside the platform.
- Debug and evaluate your AI app from your coding agent with Datadog Agent Observability (Datadog Blog) · Jun 30 — Datadog’s Agent Observability integration lets a coding agent pull LLM trace data, classify failures by type, and generate targeted fixes without leaving the IDE context.
- 5 pitfalls to avoid when measuring DevEx in the AI era (Datadog Blog) · Jun 30 — Datadog engineering flags 5 DevEx measurement traps in AI-assisted teams – chiefly confusing AI tool adoption rates with actual cycle-time or defect-rate improvements.
sec
- Papa Johns Surveillance-Based Advertising (Schneier on Security) · Jul 1 — Papa Johns is cross-referencing purchase history across retailers to predict household food depletion and time pizza ads accordingly – no breach, just retail surveillance as a service.
- Secure Amazon container workloads using container attribute-based rules in AWS Network Firewall (AWS Security) · Jul 1 — AWS Network Firewall now supports container attribute-based rules for EKS and ECS, letting you write firewall policy against pod/task labels rather than IP ranges that rotate constantly.
- How to use the AWS Workload Credentials Provider for cross-account secret retrieval and prefetching secrets (AWS Security) · Jul 1 — AWS Workload Credentials Provider adds cross-account Secrets Manager retrieval and secret prefetching, cutting round-trips for latency-sensitive workloads that pull secrets on the hot path.
web
- Worker Metrics on the WorkerStopping Event in Laravel 13.18 (Laravel News) · Jul 1 — Laravel 13.18 adds worker metrics on the WorkerStopping event, letting you instrument queue worker lifecycle and capture per-run stats before the process exits.
- Your Laravel routes can carry metadata now, and Flare shows it (Spatie) · Jul 1 — Laravel’s new route metadata API lets you attach arbitrary key-value data to route definitions and read it back at runtime – Flare now surfaces it in error reports for faster triage.
- WordPress 7.0.1 RC1 is now available (Make WordPress Core) · Jul 1 — WordPress 7.0.1 RC1 is out for testing – bug-fix only release, no feature changes, worth validating against staging before the stable drops.
- What’s new in Gutenberg 23.5? (July 1, 2026) (Make WordPress Core) · Jul 1 — Gutenberg 23.5 ships biweekly editor improvements; worth scanning the changelog if you maintain block themes or custom block extensions.
Patch your boot images, check your etcd auth, and have a good week.
