Mission Critical
Build vs. Buy
April 23, 2024

Evolving Past the In-House Model

In our previous article “Avoiding the In-House Sinkhole,” we talked about some of the general ways in which in-house systems can drain money, time, and resources from a project, hampering its ability to scale. Not all in-house systems struggle to get off the ground, but many of the most common infrastructures encounter similar pain points.

Even if things are going well, the total cost of your in-house solution is always going to be higher than a multi-tenant solution like Sift. When you amortize the cost of compute among multiple users, you get better features, a lower cost of IT, and still have 100% ownership and control of your data.

You also get a team of dedicated engineers who haven’t been reallocated away from your core goals. You can scale your platform and keep it fully utilized, and your data will never be accessible to others. But if you’re the only user of an in-house system, you’re paying for all of the compute even when you’re not using it.
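To make the utilization argument concrete, here is a minimal sketch of the arithmetic. The dollar figures and hours are hypothetical, chosen only to keep the math easy to follow; they are not Sift pricing or measured customer data.

```python
def cost_per_used_hour(monthly_cost: float, hours_used: float) -> float:
    """Effective cost of each hour the cluster is actually doing work."""
    if hours_used <= 0:
        raise ValueError("hours_used must be positive")
    return monthly_cost / hours_used


# Hypothetical numbers for illustration only.
MONTHLY_COST = 7200.0  # fixed bill for an always-on cluster
HOURS_IN_MONTH = 720   # 30 days * 24 hours

# A sole user who runs tests 10% of the time (72 hours) still pays the full bill.
single_tenant = cost_per_used_hour(MONTHLY_COST, 72)

# The same cluster kept busy 80% of the time (576 hours) by several tenants.
shared = cost_per_used_hour(MONTHLY_COST, 576)

print(f"single tenant: {single_tenant:.2f} per used hour")
print(f"shared:        {shared:.2f} per used hour")
```

Under these assumed numbers, the idle hours are what make the single-tenant figure so much higher: the bill is the same, but far fewer useful hours absorb it.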

Let’s look at some of the most common infrastructures companies tend to build in-house.

Aviation system

This type of infrastructure relies on streaming low-frequency data to a ground station and transferring high-frequency data to an external hard drive. The low-frequency data is sent to an ephemeral telemetry viewer, where viewing is all it’s good for.

The high-frequency data that you really need can only be retrieved after the vehicle lands. It is transferred to a hard drive, uploaded to an S3 bucket, and converted into several versions of the data, each written in a different format. This means you have to query it with Athena from a local notebook, which gets expensive. Each test involves around 20 notebooks for different teams and scenarios, and they aren’t collaborative.

The transfer is neither easy nor instant. Getting from the moment the vehicle lands to properly viewing the data takes at least 4 hours, with teams of engineers working on the transfer.

Sift replaces this process with real-time review, collaboration, and interactive viewing. By providing unified visualization and a user-friendly interface, Sift gives multiple teams the ability to collaborate with each other without using 20 different notebooks. These teams can view and share graphs easily and annotate anomalies without manually comparing timestamps.

Satellite system

Satellite in-house systems often rely on a different infrastructure that’s based on Timescale Writers. The biggest challenge with Timescale is excessive compute that’s difficult to maintain.

First, downlink data is sent to a ground station, then piped from the ground services software into RabbitMQ, where it can be fanned out and picked up by other software. A Timescale Writer service writes it into TimescaleDB for persistence. Apache Flink (or a similar application) reads the incoming data for real-time analysis and alerting.
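The fan-out step is easier to picture with a small model. The sketch below is an in-memory stand-in written for illustration: it does not use RabbitMQ, Timescale, or Flink, and the class name, queue names, sample frames, and the 24.0 threshold are all hypothetical. It shows only the pattern, where one published message is copied to a persistence consumer and an analysis consumer.

```python
from collections import deque


class FanoutExchange:
    """In-memory stand-in for a fan-out exchange: every bound queue gets a copy."""

    def __init__(self):
        self._queues = {}

    def bind(self, name):
        queue = deque()
        self._queues[name] = queue
        return queue

    def publish(self, message):
        for queue in self._queues.values():
            # Copy the message so one consumer cannot mutate another's view.
            queue.append(dict(message))


exchange = FanoutExchange()
writer_queue = exchange.bind("timescale_writer")   # persistence path
analysis_queue = exchange.bind("stream_analysis")  # real-time alerting path

# Hypothetical downlink frames from the ground services software.
for frame in (
    {"ts": 1.0, "channel": "battery_v", "value": 28.1},
    {"ts": 2.0, "channel": "battery_v", "value": 23.9},
):
    exchange.publish(frame)

# Writer consumer: a plain list stands in for the time-series table.
stored_rows = []
while writer_queue:
    stored_rows.append(writer_queue.popleft())

# Analysis consumer: one hypothetical rule evaluated on the same stream.
alerts = []
while analysis_queue:
    msg = analysis_queue.popleft()
    if msg["channel"] == "battery_v" and msg["value"] < 24.0:
        alerts.append(msg["ts"])

print(len(stored_rows), "rows stored; alerts at", alerts)
```

Even in this toy form, there are two consumers to keep running and in step, which is the maintenance burden the rest of this section describes.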

Engineers typically encounter schema challenges with Timescale, since it has a limited feature set and retention policies that exist only at the bucket level. It’s possible to make retention more granular, but you’d need a DBA skillset (someone who can write SQL). But why keep an index of all of your data if you don’t need to look at most of it?

With this system, you can have anywhere from a dozen to hundreds of open Flink jobs in order to manage different assets or channels that have different latency. This involves processing the data against various rules and conditions, while dealing with data that’s arriving out of order or late. It adds a considerable amount of coding and configuration complexity, along with figuring out which schema to use. It also needs to run in real time in order to create annotations and alerts.
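To give a sense of what handling out-of-order and late data involves, here is a minimal sketch of a watermark-style reorder buffer in plain Python. It illustrates the general technique of holding events for a bounded delay and releasing them in timestamp order; it is not Flink code, and the class name, the `max_delay` parameter, and the sample events are assumptions made for the example.

```python
import heapq


class ReorderBuffer:
    """Hold events briefly so they can be released in timestamp order."""

    def __init__(self, max_delay):
        self.max_delay = max_delay      # how far out of order an event may arrive
        self._heap = []                 # min-heap ordered by timestamp
        self._max_ts = float("-inf")    # largest timestamp seen so far
        self._seq = 0                   # tie-breaker for equal timestamps
        self.late = []                  # events that arrived behind the watermark

    @property
    def watermark(self):
        return self._max_ts - self.max_delay

    def push(self, ts, value):
        """Accept one event; return the events now safe to process, in order."""
        if ts < self.watermark:
            self.late.append((ts, value))
            return []
        self._max_ts = max(self._max_ts, ts)
        heapq.heappush(self._heap, (ts, self._seq, value))
        self._seq += 1
        ready = []
        while self._heap and self._heap[0][0] <= self.watermark:
            t, _, v = heapq.heappop(self._heap)
            ready.append((t, v))
        return ready

    def flush(self):
        """Release everything still buffered, in timestamp order."""
        ready = []
        while self._heap:
            t, _, v = heapq.heappop(self._heap)
            ready.append((t, v))
        return ready


buffer = ReorderBuffer(max_delay=2)
emitted = []
# Hypothetical arrival order: 2 arrives after 3, and 3.5 arrives too late.
for ts in (1, 3, 2, 6, 4, 3.5):
    emitted.extend(buffer.push(ts, f"sample@{ts}"))
emitted.extend(buffer.flush())

print([t for t, _ in emitted])
print(buffer.late)
```

Each job that evaluates rules on event time needs logic along these lines, plus a decision about what to do with the late events, which is part of why the number of jobs and the amount of configuration grow so quickly.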

Meanwhile, to view your data on this type of infrastructure it’s common to use a monitoring tool like Grafana, which may be limited in capability. Timescale Writers, Flink, and Grafana must all be running fast and in sync in order to accurately observe the vehicle’s data.

The adaptable, scalable model

Sift eliminates the need for Timescale Writers and monitoring tools like Grafana, which typically make up around a third of your cloud bill. Sift processes data as close to real time as possible, and can handle data that doesn’t arrive organized or on time.

With advanced ingestion, visualization, and automatic data review, Sift does the work of three (or more) applications without messy conversions and code-writing engineers. Sift cuts out latency and normalizes data across multiple formats, enabling quick decision-making and avoiding bottlenecks. By mapping data streams to appropriate schemas, it gives you the freedom to focus on telemetry values without getting bogged down processing data.
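As a rough picture of what normalizing data across formats means, the sketch below maps two differently shaped inputs onto one common record type. It is an illustration only and does not describe Sift's internals, which the article does not specify; the CSV and JSON-lines layouts, field names, and sample values are all hypothetical.

```python
import csv
import io
import json
from typing import NamedTuple


class Sample(NamedTuple):
    ts: float       # seconds
    channel: str
    value: float


def from_csv(text):
    """Hypothetical CSV layout: time_s,name,reading (with a header row)."""
    reader = csv.DictReader(io.StringIO(text))
    return [Sample(float(r["time_s"]), r["name"], float(r["reading"])) for r in reader]


def from_json_lines(text):
    """Hypothetical JSON-lines layout: {"t_ms": ..., "ch": ..., "v": ...}."""
    samples = []
    for line in text.splitlines():
        if line.strip():
            rec = json.loads(line)
            # Convert milliseconds to seconds so both sources share one unit.
            samples.append(Sample(rec["t_ms"] / 1000.0, rec["ch"], float(rec["v"])))
    return samples


csv_text = "time_s,name,reading\n2.0,tank_psi,310.5\n"
jsonl_text = '{"t_ms": 1500, "ch": "tank_psi", "v": 309.8}\n'

# Once both sources share a schema, they can be merged and sorted by time.
merged = sorted(from_csv(csv_text) + from_json_lines(jsonl_text))
for sample in merged:
    print(sample)
```

In an in-house pipeline, someone has to write and maintain a converter like this for every format and keep the units consistent.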

While these common in-house infrastructures might work, they require a higher degree of compute and many more resources than a multi-tenant approach. And these infrastructures often only work for one specific type of vehicle. If your company scales beyond that vehicle to make other, different machines, you’ll likely need to re-tool your system.

Sift provides an adaptable system that enables you to scale – regardless of whether you’re sending a rocket into space or automating a fleet of rail vehicles. Instead of complicating the process with multiple applications that must run smoothly together, you can reduce the risk of your infrastructure failing and your machine breaking.