Mission Critical
Build vs. Buy
March 14, 2024

Warning Signs of an Outdated Telemetry Stack

If you were to walk into a busy architecture firm 50 years ago, you would have found a completely different picture from today’s world. Back then, the average firm was filled with teams of draftsmen hunched over giant sheets of paper, meticulously sketching plans with pencils, erasers, and T-squares.

If a change order came through, those teams had to throw everything out and redraw their specs by hand. It was tedious, manual work that relied on rudimentary tools and countless hours, with little margin for error.

But with the invention of AutoCAD, architects and engineers could better visualize their plans and easily make changes thanks to digital drafting. The software also introduced standardized conventions and templates, which led to faster documentation, quicker iteration, and drastically reduced human error.

Draftsmen sketching architectural plans before the invention of the AutoCAD system

AutoCAD software was invented in 1982 and it completely revolutionized how we design and construct everything — from family homes to skyscrapers. And yet today, machine builders are still operating like it's the 1960s, manually reviewing their data and patching together existing systems to create some semblance of a telemetry stack.

What is a telemetry stack?

A telemetry stack is a set of tools that helps you organize, understand, and troubleshoot data produced by complex machines. Managing and analyzing the deluge of data that comes from these machines is crucial for mission success. Without it, you may find it impossible to scale — or it may take much longer than it would have otherwise.

If you’re building machines, you might already have some version of a telemetry stack, cobbled together from legacy systems that don’t work well together. In this scenario, your propulsion, GNC, and flight software teams are each using their own tools to complete their analysis, limiting visibility and exposing you to risk.

NASA team members waiting for a telemetry readout during the VISIONS mission

Combining advanced tools like MATLAB and Jupyter might give you some limited analysis features, but this way of doing things creates bottlenecks in which only a few key engineers hold important specialized knowledge. It also doesn’t allow engineers to easily share data or collaborate internally. Imagine trying to quickly troubleshoot an anomaly while sorting through email, Slack/Teams, and PowerPoint discussions, all while waiting for large data files to load. You certainly won’t diagnose anything fast.

Some companies are asking engineers to try to build this complex stack in-house — often spending more time and money on less capable alternatives that solve only limited problems. Building this kind of system in-house diverts meaningful engineering time and funding from core engineering challenges, which inevitably leads to mistakes and, occasionally, critical disasters.

Even if you succeed in getting your in-house system up and running, you’ll quickly outgrow it. Your engineers will be busy scaling mission-critical databases and manually running Python scripts for data review instead of building machines. When you’re creating fleets of complex machines, even tiny mistakes can become massive if you miss them. Without the right tools, mistakes not only become more likely, but also more devastating.

Antares Orb-3 launch failure

Do you have an observability problem?

There are a few simple signs that indicate it’s time to upgrade your telemetry tools. For machine builders, the process of creating new software releases is a common sore spot. This is one of those areas where you can easily introduce risk to your production hardware if you’re not careful.

Companies with an observability problem acutely feel the trade-off between creating frequent releases and burning out their engineers with tedious data review. If you’re creating a high number of releases, your team will also spend a large proportion of its time manually reviewing data with a low signal-to-noise ratio. As fatigue sets in, bugs will be missed.

On the other hand, if you decide to create fewer releases, each release will encompass more software changes. With such a long list of changes, it will be difficult to root-cause any bugs that you find. Either way, you’re left with the same level of risk.

How releases become painful

Why are releases such a common pain point throughout the industry? Without the help of advanced tools, this is what the release process looks like:

  • Your engineers are manually reviewing data, looking through endless dashboards, putting in late nights and long hours to do something that should be automated. This is where mistakes can happen, where tiny details get missed, and where you’re at greater risk for staff attrition.
  • Your engineers need to write SQL queries and Python scripts just to visualize their data. Exploring the data requires an inordinate amount of effort, and bogs down investigations with unnecessary steps.
  • Tasks become bottlenecked because you’re relying on only one or two key engineers who hold the knowledge you need.
  • You’re struggling to share large data files, and collaboration happens externally through meetings and email chains.
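To make the first two points concrete, a manual review pass often boils down to a one-off script like the hedged Python sketch below, with a hypothetical channel name and limit hard-coded into each engineer’s personal copy:

```python
# Hypothetical manual data-review script. Each engineer keeps a slightly
# different copy, with channel names and limits hard-coded and prone to
# drifting out of date.
BATTERY_TEMP_LIMIT_C = 60.0  # assumed limit; it lives only in this script

def find_violations(rows):
    """Return (timestamp, value) pairs where the battery temperature
    channel exceeds the hard-coded limit."""
    violations = []
    for row in rows:
        temp = float(row["battery_temp_c"])
        if temp > BATTERY_TEMP_LIMIT_C:
            violations.append((row["timestamp"], temp))
    return violations

# In practice this runs over a large exported data file; inline samples here.
samples = [
    {"timestamp": "T+000", "battery_temp_c": "41.2"},
    {"timestamp": "T+060", "battery_temp_c": "63.8"},  # exceeds the limit
    {"timestamp": "T+120", "battery_temp_c": "55.0"},
]
print(find_violations(samples))  # [('T+060', 63.8)]
```

Multiply this by hundreds of channels, several teams, and every release, and the toil described above follows directly.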

What does a modern release process look like?

With Sift’s advanced observability tools, your releases become simpler and much more straightforward:

  • Data from your test and production assets streams to a centralized data lake.
  • All of the tests needed to verify the release can be found quickly.
  • Every user in your organization — regardless of database or scripting ability — can point and click to visualize and explore the performance of their hardware.
  • All of your data review checks are embedded into a rules engine that constantly evaluates incoming data.
  • Anomalies are automatically flagged for your team to review.
  • With an easy-to-explore, unified data source, engineers can quickly trace anomalies to test-environment, hardware, or software causes.
  • Your team can refine its rules further to raise the signal-to-noise ratio even higher.
  • Once all of your anomalies are resolved, you can confidently release software to production.
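As an illustration only (this is a generic sketch, not Sift’s actual API), the rules-engine step above can be pictured as a set of named checks evaluated against every incoming sample, with failures flagged as anomalies. The channel names and limits are hypothetical:

```python
# Minimal sketch of a streaming rules engine (not Sift's actual API).
# Named rules are evaluated against every incoming telemetry sample and
# failures are flagged as anomalies for review. Channel names and limits
# below are hypothetical.
RULES = {
    "battery_temp_high": lambda s: s["battery_temp_c"] <= 60.0,
    "bus_voltage_low": lambda s: s["bus_voltage_v"] >= 24.0,
}

def evaluate(sample):
    """Return the names of every rule the sample violates."""
    return [name for name, passes in RULES.items() if not passes(sample)]

stream = [
    {"timestamp": "T+000", "battery_temp_c": 41.2, "bus_voltage_v": 28.1},
    {"timestamp": "T+060", "battery_temp_c": 63.8, "bus_voltage_v": 23.5},
]
anomalies = {s["timestamp"]: evaluate(s) for s in stream if evaluate(s)}
print(anomalies)  # {'T+060': ['battery_temp_high', 'bus_voltage_low']}
```

Because the rules live in one shared place rather than in per-engineer scripts, tightening a limit or adding a check applies immediately to all incoming data, which is what makes automatic anomaly flagging possible.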

A new era of machine innovation

No matter how difficult your mission is, data review should not take up an inordinate amount of your time, and releases should be painless. Manually reviewing your data is akin to an architect hiring a draftsman to draw up plans instead of using AutoCAD. With Sift, you can complete data review for an entire space station in under a day.

We started this company because we’ve seen how mission-critical telemetry and data review tools are, and yet how often they act as a drag rather than an accelerator. Instead of trial and error, it should be trial and measure. Learn. Improve. Progress.

The complex machines of today require updated systems that are scalable, easy to use, and powerful. At Sift, we’ve built a full observability stack for data ingestion and storage, visualization, and automated data review for complex hardware development and implementation. Our tools are already being deployed by leading engineering teams who are pushing the limits of what we can achieve in space and here on Earth.

By using the best tools available for machine data, Sift helps your company find anomalies faster, automate data review, and scale without breaking. And with Sift, releases are painless, which means they never fall behind schedule.

Learn more about how Sift can fast track releases >>>