Midterm Reflection of Internship


2 months in

Today marks 60 days of my summer internship at Deloitte.

I am especially enjoying the social events and the flexibility to work from the office on my own schedule.

In the past 2 months I have accomplished a few things, mostly involving learning about the fields of Data Engineering and Cloud Computing.

For the sake of clarity, I decided to split my experiences so far into 3 phases.

Phase 1: Training and Onboarding

This is where it all starts.

I received a work laptop a week before my start date. I had to wait patiently for an email with instructions to get set up.

The first few weeks involved a lot of head scratching and an abundance of tasks to complete. New hires were expected to fill out forms and answer questions about security, compliance, sexual harassment, and insider information.

Additionally, I was introduced to the demanding workflow at Deloitte. Everyone needs to submit an internal resume so that managers can notice you and invite you to their projects.

However, there were definitely some amazing perks.

The company offered $500 to all new hires for upgrading their "work from home" setup. I spent $100 on a mechanical keyboard and the rest on a 4K wide monitor.

Bonuses ranging from $1,000 to $1,500 were handed out for completing certifications. With the guidance of my hiring manager, I settled on learning about Microsoft Azure's services and basic cloud computing concepts.

Certification #1: AZ-900, Azure Fundamentals

For my second certification, I decided to do something more hands-on and learned about Spark, end-to-end ETL pipelines, data processing, and structured streaming.

Certification #2: Data Engineering Associate, Databricks

Phase 2: Finding a project

A month after joining Deloitte, I was finally approached for a project.

Since the project is confidential, I won't disclose any precise information about it.

I will attempt to describe what our team was working on and the general structure of the application we were building.

The application we were building was a SIEM (Security Information and Event Management) tool for monitoring cyber and financial threats. It would use graph algorithms and machine learning to detect anomalies. This type of solution is called UEBA (User and Entity Behavior Analytics) and is the current industry standard for cybersecurity services.

The project consisted of three main teams: data engineering, full-stack and cybersecurity. I was assigned to the cybersecurity team.

Data engineers were tasked with ingesting data from the endpoints (users, servers, routers, etc.) and preparing it for machine learning and analysis.

Full-stack developers built the application itself: a frontend where IT analysts review threats and write reports, and a backend combining graph databases, server routing, caching, and private APIs to create visualizations and give the frontend access to the data.

Finally, the cybersecurity team focused on creating use cases to figure out when exactly an endpoint has deviated from its usual behavior. For example, when a user transfers 20 GB of data through an email to an external location. The main priority was to reduce false positives while maintaining a near-zero count of false negatives.
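The 20 GB email example above can be sketched as a simple baseline-deviation rule: compare today's outbound volume against the user's own history and only flag large, statistically unusual transfers. Everything here (the z-score approach, the thresholds, the hard floor) is an illustrative toy, not the project's actual detection logic.

```python
from statistics import mean, stdev

def is_anomalous(history_gb, today_gb, z_threshold=3.0, hard_floor_gb=1.0):
    """Flag today's outbound email volume if it deviates sharply from
    the user's own baseline. All thresholds are made up for illustration."""
    # Ignore tiny transfers outright: an easy way to cut false positives.
    if today_gb < hard_floor_gb:
        return False
    mu = mean(history_gb)
    sigma = stdev(history_gb)
    if sigma == 0:
        # Flat history: fall back to a crude multiple-of-baseline check.
        return today_gb > mu * 2
    # Flag only transfers far outside the user's normal variation.
    return (today_gb - mu) / sigma > z_threshold
```

A user who normally emails a few hundred megabytes a day would trip this rule on a 20 GB transfer, while ordinary day-to-day variation would not.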

So far, I have researched the pros and cons of similar applications on the market using reports from the Gartner Magic Quadrant.

Phase 3: Finding direction

I am currently in this phase. Unfortunately, my short stint on the project has already ended due to internal conflicts.

I decided to take some time off to work on personal projects, and I will be reaching out to managers next week to find more projects to work on.

I have found a direction through my experiences so far. I found the Spark engine very interesting and will definitely push to use it as a tool during my next work term. Apart from that, these blog posts I am writing are also something I want to pursue.