What’s the difference between a data architect, a data engineer, a data analyst and a data scientist?

Data engineer, data architect, data analyst... Over the past few years, new data jobs have gradually appeared on the job market. There are now so many of them that the titles can be confusing. Here are a few short definitions to help you understand who does what.



Data engineer: The data engineer collects the data, stores it, runs batch or real-time processing on it, and serves it via an API that a data analyst or data scientist can easily query. In short, the data engineer provides consolidated big data so that the analyst or scientist can analyze it. Skills: data structures, relational (e.g. PostgreSQL) and NoSQL (e.g. Cassandra, Redis) databases, networks.

Data architect: The data architect is an advanced data engineer with more experience. The architect builds the big data architecture (Hadoop clusters…) and manages all the servers, has a holistic view of the team’s or company’s architecture, and keeps up with the latest databases and related technologies. Skills: data architects should have solid knowledge of database architecture, data structures, Hadoop, Cassandra, Spark, MapReduce, Storm, Kafka, and/or other big data technologies and their capabilities, along with a passion for working with relational (e.g. PostgreSQL) and NoSQL (e.g. Cassandra, Redis) databases and networks.

Data analyst: The data analyst is in charge of making sense of the data through univariate or multivariate analyses and visualizations, which give a clearer understanding of the existing data. Skills: data analysts should be proficient with tools such as Microsoft Office (mainly Excel and Power Pivot), Tableau, and other visualization tools.

Data scientist: Rather than focusing mostly on modeling the interactions in existing data, the data scientist makes predictions about these interactions, to give a better understanding of how the data might evolve over time. Skills: data scientists should have a strong statistical background as well as good knowledge of business applications and computing.
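The difference between describing existing data and predicting how it might evolve can be shown with a toy example. The sketch below uses made-up monthly sales figures and hypothetical function names; a simple least-squares line stands in for the far richer models a data scientist would normally use.

```python
from statistics import mean

# Hypothetical monthly sales figures (made-up illustration data).
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]


def describe(values):
    """Analyst-style summary: describe the data that already exists."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(xs, ys, new_x):
    """Scientist-style step: extrapolate the fitted trend to an unseen x."""
    slope, intercept = fit_line(xs, ys)
    return slope * new_x + intercept


summary = describe(sales)            # what the existing data looks like
forecast = predict(months, sales, 7) # what month 7 might look like
```

Here `summary` answers "what happened?", while `forecast` answers "what is likely to happen next?", assuming the past trend continues.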



Image via Data Science 101