Data Engineering: Is It the Right Field for You?

Imagine that you wake up in the morning and check your sleep score on a wellness app on your phone. You get up and go for a jog, logging your pace and mileage on a fitness app. While eating breakfast, you place a grocery delivery order on a third app, check your spending against your budget on a fourth app, and search for flights for an upcoming trip on a fifth app. Consider all the pieces of data you’ve just produced… by 10 am. 

Think of each of those apps as a data stream, all flowing into a pool (or a “data lake”) of information about you: your habits, your likes, your dislikes, your accomplishments, and your goals. If you wanted to analyze your exercise habits, that would be easy enough, right? A simple line graph would do, since all the data is in one app in the same format.

But what if you wanted to weave multiple streams of data together to create a more complete picture of your morning and help you plan a better routine? Do you run faster after a good night’s sleep? Does your grocery spending fluctuate within your budget, and does it depend on upcoming travel? How does what you ordered two weeks ago affect your morning run, and could it predict your pace six weeks from now? Should you order a different brand of cereal if you’re training for a 10K? Now you’ve got five apps funneling five types of data into a big pool, and for an added challenge, some of that data is in a PDF, some is a giant CSV, some is in a Google spreadsheet, and some of it, inexplicably, is written in Cyrillic, just for fun.

This is one fairly simplistic example of the types of complex problems a data engineer might solve. 

What data engineers actually do

A data engineer’s job is to ingest all the relevant streams of data (the system that moves and processes them is often called a data pipeline) and sort, organize, standardize, and store that data in a way that data scientists can actually use. “A data scientist is only as good as the data they have access to,” says the online learning platform DataQuest, and that’s why data engineering is essential to the future of business. It is estimated that the world will have created and stored 200 zettabytes of data by the year 2025.
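To make “standardize” a little more concrete, here is a minimal Python sketch of that step, using two of the apps from our morning routine. Everything in it is an assumption made up for illustration: the sample exports, the field names, and the common record shape (day, source, metric, value) do not come from any real app or from Vanguard’s systems. It only shows the general idea of turning differently formatted inputs into one consistent format.

```python
import csv
import io
import json
from datetime import date, datetime

# Hypothetical fitness-app export: CSV, US-style dates, distance in miles.
RUNS_CSV = """date,miles,minutes
08/01/2022,3.1,27.9
08/02/2022,4.0,38.0
"""

# Hypothetical sleep-app export: JSON, ISO dates, score out of 100.
SLEEP_JSON = '[{"day": "2022-08-01", "score": 82}, {"day": "2022-08-02", "score": 64}]'


def standardize_runs(text):
    """Convert the CSV export into the common record shape."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        pace = float(row["minutes"]) / float(row["miles"])
        records.append({
            "day": datetime.strptime(row["date"], "%m/%d/%Y").date(),
            "source": "runs",
            "metric": "pace_min_per_mile",
            "value": round(pace, 2),
        })
    return records


def standardize_sleep(text):
    """Convert the JSON export into the same record shape."""
    return [
        {
            "day": date.fromisoformat(item["day"]),
            "source": "sleep",
            "metric": "sleep_score",
            "value": float(item["score"]),
        }
        for item in json.loads(text)
    ]


# One list, one format, sorted by day: ready to store or query.
records = sorted(
    standardize_runs(RUNS_CSV) + standardize_sleep(SLEEP_JSON),
    key=lambda r: (r["day"], r["source"]),
)

for record in records:
    print(record)
```

Once every source speaks the same format, adding a third or a thousandth source means writing one more small converter rather than rethinking the whole system.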

Back to our example with your epic morning: imagine that instead of those five data sources from your five apps, you have thousands of data sources with different formats, with millions of individual pieces of data, flying at you at a pace the human brain can hardly conceive. Now your job is not synthesizing five differently formatted spreadsheets from your apps but designing, building, and maintaining data “pipes” that funnel truly massive quantities of information into something your data scientist colleagues can use to illustrate insights about your business. You’re responsible for untangling the streams, cleaning the pipes, and organizing the data lake so that when the data scientists go in with a query, a clear and concise answer can emerge.

What would I code in?

So which tools would you use as a data engineer? It depends! At Vanguard, you might see tools like Hadoop, Spark, and AWS Glue, and languages like Python, R, and Scala, but there are many options out there. Data engineers think about how to collect the data, how to store it, how to optimally join it together, and how to maintain these business-critical functions. You might work with different monitoring tools and implement site reliability practices as well. Think of yourself as deep in the bowels of a machine, making sure that what shows up on the surface is usable, accurate, and clear.
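As a small taste of what “joining data together” means, here is a hedged Python sketch that matches sleep scores to run paces by day, so you could start asking whether you run faster after a good night’s sleep. The numbers and variable names are invented for illustration. In practice a join at this scale would more likely run in a tool like Spark or a SQL database; plain Python dictionaries are used here only to keep the idea visible.

```python
from datetime import date

# Hypothetical, already-standardized data keyed by day.
sleep_scores = {
    date(2022, 8, 1): 82,
    date(2022, 8, 2): 64,
    date(2022, 8, 3): 90,  # slept well, but no run logged
}
run_paces = {
    date(2022, 8, 1): 9.0,  # minutes per mile
    date(2022, 8, 2): 9.5,
    date(2022, 8, 4): 9.2,  # ran, but no sleep score logged
}


def inner_join(left, right):
    """Keep only the days that appear in both datasets."""
    shared_days = sorted(left.keys() & right.keys())
    return [(day, left[day], right[day]) for day in shared_days]


joined = inner_join(sleep_scores, run_paces)

for day, score, pace in joined:
    print(f"{day}: sleep score {score}, pace {pace} min/mile")
```

Notice that the days missing from either source drop out of the result. Deciding what to do with gaps like these (drop them, fill them in, or flag them) is exactly the kind of judgment call a data engineer makes.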

Is data engineering right for me?

If you’re the kind of person who enjoys finding the right tool for the job, stepping back and looking at the big picture, or designing complex systems that optimize for efficiency, data engineering might be for you. 

On a personal level, says Sonny Kwok, a data engineer at Vanguard, “If you like to quantify your own decisions using data, data engineering is a good field for you.” Think back to our morning routine: if digging in and figuring out how to piece together all that data to improve your outcomes sounds fun (and it doesn’t to everyone!), data engineering might be right for you.

If you derive satisfaction from working on a team, getting your colleagues exactly the right information in exactly the right way, and helping your business grow by leveraging the power of giant streams of information, data engineering might be right for you. 

If instead of being intimidated by the massive scale of the data hurtling towards us (and it’s only getting worse), you’re excited to try to get your arms around it, data engineering sounds like just the right fit. Continues Sonny, “In data engineering, there is always a better way to do things, so the learning never stops.”

Are you interested in pursuing a career as a Data Engineer? Onramp launches our Data Engineering apprenticeship in partnership with Vanguard on Monday, 8/15/22. You can find more information on the Onramp site, including the role page, training plan, and application. The deadline for submission is 7 pm PT on Tuesday, 8/30/22.

Written by
Dana Breen
