# Data Engineering @ Google
## Integrating Data from Multiple Systems
Data Engineer @ Google | Difficulty: Hard

Describe how you would integrate data from multiple systems. Explain the steps involved, the technologies you would use, and any challenges you might encounter.

## Differences Between SQL and NoSQL Databases
Data Engineer @ Google | Difficulty: Medium

Discuss the differences between SQL and NoSQL databases and when to use each.

## Database Design Question
Data Engineer @ Google | Difficulty: Hard

Design a database schema for an online bookstore. The schema should include tables for books, authors, customers, and orders. Describe the relationships between the tables and any constraints you would apply.
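
One possible starting point, sketched with Python's built-in sqlite3 module so it runs without a database server. The join tables book_authors and order_items, and every column beyond the four entities named in the question, are assumptions made for illustration; a real answer would adapt them to the requirements discussed with the interviewer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE books (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    isbn    TEXT NOT NULL UNIQUE,
    price   REAL NOT NULL CHECK (price >= 0)
);
-- Many-to-many: a book may have several authors, an author several books.
CREATE TABLE book_authors (
    book_id   INTEGER NOT NULL REFERENCES books(book_id),
    author_id INTEGER NOT NULL REFERENCES authors(author_id),
    PRIMARY KEY (book_id, author_id)
);
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE
);
-- One-to-many: each order belongs to exactly one customer.
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL
);
-- Many-to-many between orders and books, with a quantity per line.
CREATE TABLE order_items (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    book_id  INTEGER NOT NULL REFERENCES books(book_id),
    quantity INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, book_id)
);
""")

# The foreign key rejects an order that points at a customer who does not exist.
try:
    conn.execute(
        "INSERT INTO orders (customer_id, order_date) VALUES (999, '2024-01-01')"
    )
    orphan_order_rejected = False
except sqlite3.IntegrityError:
    orphan_order_rejected = True
```

The constraints carry most of the design: UNIQUE on isbn and email prevents duplicates, CHECK keeps prices and quantities sane, and the composite primary keys stop the same author or book from being linked twice.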

## Querying and Transforming Data Points
Data Engineer @ Google | Difficulty: Medium

Given a set of data points, write SQL queries to perform various transformations. Suppose you have a table sales_data with columns sale_id, product_id, sale_date, quantity, and price. Write SQL queries to:

  1. Calculate the total sales amount for each product.
  2. Find the average quantity sold per product per day.
  3. Identify the top 3 best-selling products by total sales amount.
  4. Transform the sale_date to include only the month and year.
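
The four tasks above can be sketched as follows, using Python's sqlite3 module and a handful of made-up rows. Two assumptions are not stated in the question: price is taken to be the unit price (so the sales amount is quantity * price), and "average quantity per product per day" is read as the mean of each product's daily totals. The strftime call is SQLite syntax; other engines use functions such as DATE_FORMAT or TO_CHAR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sales_data (
           sale_id    INTEGER PRIMARY KEY,
           product_id INTEGER NOT NULL,
           sale_date  TEXT NOT NULL,     -- ISO 8601 date such as 2024-01-05
           quantity   INTEGER NOT NULL,
           price      REAL NOT NULL      -- assumed to be the unit price
       )"""
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?, ?)",
    [  # illustrative rows only
        (1, 10, "2024-01-05", 2, 5.0),
        (2, 10, "2024-01-05", 1, 5.0),
        (3, 10, "2024-01-06", 3, 5.0),
        (4, 20, "2024-02-01", 1, 40.0),
        (5, 30, "2024-02-03", 4, 2.5),
        (6, 40, "2024-02-03", 1, 1.0),
    ],
)

# 1. Total sales amount for each product.
total_per_product = conn.execute(
    """SELECT product_id, SUM(quantity * price) AS total_sales
       FROM sales_data
       GROUP BY product_id
       ORDER BY product_id"""
).fetchall()

# 2. Average quantity per product per day: sum per day first, then average.
avg_daily_quantity = conn.execute(
    """SELECT product_id, AVG(daily_qty) AS avg_daily_qty
       FROM (SELECT product_id, sale_date, SUM(quantity) AS daily_qty
             FROM sales_data
             GROUP BY product_id, sale_date)
       GROUP BY product_id
       ORDER BY product_id"""
).fetchall()

# 3. Top 3 products by total sales amount.
top_three = conn.execute(
    """SELECT product_id, SUM(quantity * price) AS total_sales
       FROM sales_data
       GROUP BY product_id
       ORDER BY total_sales DESC
       LIMIT 3"""
).fetchall()

# 4. Reduce sale_date to year and month.
month_year = conn.execute(
    """SELECT sale_id, strftime('%Y-%m', sale_date) AS sale_month
       FROM sales_data
       ORDER BY sale_id"""
).fetchall()
```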

## SQL Query to Retrieve Data Using Joins
Data Engineer @ Google | Difficulty: Medium

Write a SQL query to retrieve data from multiple tables using joins. Suppose you have three tables: customers, orders, and products. The customers table has columns customer_id, name, and email. The orders table has columns order_id, customer_id, order_date, and total_amount. The products table has columns product_id, order_id, product_name, and price. Write an SQL query to retrieve the customer's name, email, order date, product name, and price for all orders.
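
A minimal sketch of one way to answer this, run through Python's sqlite3 module against a few invented rows. The table and column names come from the question; the sample data and the choice of a LEFT JOIN on products (so that an order with no product rows still appears, matching "all orders") are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, total_amount REAL);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, order_id INTEGER,
                       product_name TEXT, price REAL);

-- Illustrative rows only.
INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com'),
                             (2, 'Lin', 'lin@example.com');
INSERT INTO orders VALUES (100, 1, '2024-03-01', 25.0),
                          (101, 2, '2024-03-02', 0.0);
INSERT INTO products VALUES (1, 100, 'Keyboard', 20.0),
                            (2, 100, 'Cable', 5.0);
""")

JOIN_QUERY = """
SELECT c.name, c.email, o.order_date, p.product_name, p.price
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id   -- every order has a customer
LEFT JOIN products AS p ON p.order_id = o.order_id     -- keep orders without products
ORDER BY o.order_date, p.product_name
"""

rows = conn.execute(JOIN_QUERY).fetchall()
```

With an inner join on products instead, order 101 would drop out of the result because it has no matching product rows. Which behavior is wanted is worth confirming with the interviewer.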

## ACID Properties and Their Significance in Database Transactions
Data Engineer @ Google | Difficulty: Medium

Discuss ACID properties and their significance in database transactions.
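
A small illustration can anchor the discussion. The sketch below uses Python's sqlite3 module and a hypothetical accounts table to show atomicity and consistency: a transfer that would overdraw an account violates a CHECK constraint, and the whole transaction is rolled back, including the credit that had already been applied. Isolation and durability are not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE accounts (
           name    TEXT PRIMARY KEY,
           balance INTEGER NOT NULL CHECK (balance >= 0)  -- consistency rule
       )"""
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failing debit leaves a partial change to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


first_ok = transfer(conn, "alice", "bob", 30)    # valid transfer
second_ok = transfer(conn, "alice", "bob", 500)  # would overdraw alice
final_balances = balances(conn)
```

After the failed transfer, bob's balance does not include the 500 credit: either both updates take effect or neither does.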

## Processing Large Data Sets Using Apache Spark
Data Engineer @ Google | Difficulty: Hard

You have a large dataset stored in a distributed file system like HDFS, and you need to perform complex transformations and aggregations. Explain how you would use Apache Spark to process this dataset. Provide an example of a Spark job that calculates the average value of a specific column.
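
The snippet below is not Spark code. It is a plain-Python sketch of the idea behind a distributed average, so that it runs without a cluster: each partition is reduced to a (sum, count) pair, and the pairs are then merged. The partitions list is made-up data standing in for blocks of a file on HDFS. In PySpark itself the same result would typically come from a DataFrame aggregation such as df.agg(F.avg("column_name")), where the column name is a placeholder.

```python
from functools import reduce

# Hypothetical partitions of one numeric column, as a cluster might split a file.
partitions = [[10.0, 20.0], [30.0], [40.0, 50.0, 60.0]]


def partial_aggregate(values):
    """Per-partition step: reduce the values to a (sum, count) pair."""
    return (sum(values), len(values))


def merge(a, b):
    """Combine two (sum, count) pairs; associative, so merge order is free."""
    return (a[0] + b[0], a[1] + b[1])


# Each partition is summarized independently (in parallel on a real cluster).
partials = [partial_aggregate(p) for p in partitions]

# Only the small (sum, count) pairs need to travel between workers.
total, count = reduce(merge, partials, (0.0, 0))
average = total / count if count else None
```

Averaging the per-partition averages would be wrong when partitions differ in size, which is why the sum and count are carried separately until the final division.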

## Design a Database Schema for a Blogging Platform
Data Engineer @ Google | Difficulty: Hard

Design a database schema for a blogging platform. The schema should include tables for users, posts, comments, and tags. Describe the relationships between the tables and any constraints you would apply.
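
A possible schema, again sketched with Python's sqlite3 module. The post_tags join table, the column names, and the decision to cascade deletes from a post to its comments and tag links are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys

conn.executescript("""
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email    TEXT NOT NULL UNIQUE
);
-- One-to-many: a user writes many posts.
CREATE TABLE posts (
    post_id   INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES users(user_id),
    title     TEXT NOT NULL,
    body      TEXT NOT NULL
);
-- One-to-many: a post has many comments, each written by a user.
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    post_id    INTEGER NOT NULL REFERENCES posts(post_id) ON DELETE CASCADE,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    body       TEXT NOT NULL
);
CREATE TABLE tags (
    tag_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
-- Many-to-many: posts and tags are linked through a join table.
CREATE TABLE post_tags (
    post_id INTEGER NOT NULL REFERENCES posts(post_id) ON DELETE CASCADE,
    tag_id  INTEGER NOT NULL REFERENCES tags(tag_id),
    PRIMARY KEY (post_id, tag_id)
);
""")

# Illustrative rows, then a delete to show the cascade at work.
conn.execute("INSERT INTO users VALUES (1, 'ada', 'ada@example.com')")
conn.execute("INSERT INTO posts VALUES (1, 1, 'Hello', 'First post')")
conn.execute("INSERT INTO comments VALUES (1, 1, 1, 'Nice')")
conn.execute("INSERT INTO tags VALUES (1, 'intro')")
conn.execute("INSERT INTO post_tags VALUES (1, 1)")
conn.execute("DELETE FROM posts WHERE post_id = 1")

remaining_comments = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
remaining_links = conn.execute("SELECT COUNT(*) FROM post_tags").fetchone()[0]
remaining_tags = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
```

Deleting the post removes its comments and tag links but leaves the tag itself, since tags are shared across posts.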

## Design a Database and Summarize Sales Data
Data Engineer @ Google | Difficulty: Hard

Design a database from scratch to store and summarize sales data. Include tables for products, customers, orders, and order details. Explain how you would design the schema, summarize sales data, and predict future sales.

## Building a Scalable Data Engineering Solution for YouTube
Data Engineer @ Google | Difficulty: Hard

Describe the types of technologies and architecture you would need to build a scalable data engineering solution for a platform like YouTube. Focus on data ingestion, storage, processing, and analytics.

## Complex Query for Analyzing Customer Orders and Product Sales
Data Engineer @ Google | Difficulty: Hard

Given the following database schema for an e-commerce platform:

Customers Table:

  • customer_id (Primary Key)
  • first_name
  • last_name
  • email

Orders Table:

  • order_id (Primary Key)
  • customer_id (Foreign Key)
  • order_date
  • total_amount

OrderDetails Table:

  • order_detail_id (Primary Key)
  • order_id (Foreign Key)
  • product_id (Foreign Key)
  • quantity
  • price

Products Table:

  • product_id (Primary Key)
  • product_name
  • category

Write a complex SQL query to retrieve the top 5 customers who have spent the most on products in the 'Electronics' category over the past year. The query should also return the total amount spent by these customers and the number of orders they placed in the past year.
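
One way to approach this, sketched with Python's sqlite3 module and invented rows. Several choices here are assumptions rather than part of the question: price in OrderDetails is treated as a unit price, "orders placed" counts all of a customer's orders in the window (not only those containing electronics), and the reference date is passed in as the :as_of parameter so the example is reproducible. In production one would more likely use the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT,
                        last_name TEXT, email TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, total_amount REAL);
CREATE TABLE order_details (order_detail_id INTEGER PRIMARY KEY, order_id INTEGER,
                            product_id INTEGER, quantity INTEGER, price REAL);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT,
                       category TEXT);

-- Illustrative rows only.
INSERT INTO customers VALUES (1, 'Ada', 'Lovelace', 'ada@example.com'),
                             (2, 'Lin', 'Chen', 'lin@example.com'),
                             (3, 'Sam', 'Roe', 'sam@example.com');
INSERT INTO products VALUES (1, 'Laptop', 'Electronics'),
                            (2, 'Novel', 'Books'),
                            (3, 'Phone', 'Electronics');
INSERT INTO orders VALUES (1, 1, '2024-01-10', 1000.0),
                          (2, 1, '2024-02-10', 30.0),
                          (3, 2, '2024-03-01', 1200.0),
                          (4, 2, '2022-01-01', 5000.0),  -- outside the window
                          (5, 3, '2024-04-01', 15.0);
INSERT INTO order_details VALUES (1, 1, 1, 1, 1000.0),
                                 (2, 2, 2, 2, 15.0),
                                 (3, 3, 3, 2, 600.0),
                                 (4, 4, 1, 5, 1000.0),
                                 (5, 5, 2, 1, 15.0);
""")

TOP_ELECTRONICS_QUERY = """
WITH recent AS (
    SELECT c.customer_id, c.first_name, c.last_name,
           SUM(CASE WHEN p.category = 'Electronics'
                    THEN od.quantity * od.price ELSE 0 END) AS electronics_spend,
           COUNT(DISTINCT o.order_id) AS orders_placed
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    JOIN order_details AS od ON od.order_id = o.order_id
    JOIN products AS p ON p.product_id = od.product_id
    WHERE o.order_date >= date(:as_of, '-1 year')
    GROUP BY c.customer_id, c.first_name, c.last_name
)
SELECT customer_id, first_name, last_name, electronics_spend, orders_placed
FROM recent
WHERE electronics_spend > 0
ORDER BY electronics_spend DESC
LIMIT 5
"""

top_customers = conn.execute(TOP_ELECTRONICS_QUERY, {"as_of": "2024-06-30"}).fetchall()
```

COUNT(DISTINCT o.order_id) matters because the join to order_details produces one row per line item, so a plain COUNT(*) would count items rather than orders.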

## Designing a Data Analytics Pipeline for E-commerce Platform
Data Engineer @ Google | Difficulty: Hard

Design a data analytics pipeline for an e-commerce platform. Describe the key components, technologies, and processes involved from data ingestion to reporting and analytics.

## Complex SQL Query Using Joins, Unions, Window Functions, and Subqueries
Data Engineer @ Google | Difficulty: Hard

Given the following database schema for a retail platform:

Customers Table:

  • customer_id (Primary Key)
  • first_name
  • last_name
  • email

Orders Table:

  • order_id (Primary Key)
  • customer_id (Foreign Key)
  • order_date
  • total_amount

OrderDetails Table:

  • order_detail_id (Primary Key)
  • order_id (Foreign Key)
  • product_id (Foreign Key)
  • quantity
  • price

Products Table:

  • product_id (Primary Key)
  • product_name
  • category

Write a complex SQL query that performs the following tasks:

  1. Use a JOIN to retrieve the customer's name and email along with the total amount they spent on each order.
  2. Use a UNION to combine the result with a similar query that retrieves the total number of products ordered by each customer.
  3. Use a WINDOW FUNCTION to rank customers based on their total spending.
  4. Use a SUBQUERY to find customers who have placed more than 5 orders.

The final result should include the customer's name, email, total amount spent, total products ordered, and their spending rank.
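
The four requirements can be combined in more than one way; the following is one hedged reading, run through Python's sqlite3 module on generated rows. Assumptions: the two UNION ALL branches are tagged with a metric label and pivoted back into columns, the rank is computed only among customers who pass the "more than 5 orders" filter, and the SQLite build supports window functions (version 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT,
                        last_name TEXT, email TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, total_amount REAL);
CREATE TABLE order_details (order_detail_id INTEGER PRIMARY KEY, order_id INTEGER,
                            product_id INTEGER, quantity INTEGER, price REAL);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT,
                       category TEXT);
INSERT INTO customers VALUES (1, 'Ada', 'Lovelace', 'ada@example.com'),
                             (2, 'Lin', 'Chen', 'lin@example.com'),
                             (3, 'Sam', 'Roe', 'sam@example.com');
INSERT INTO products VALUES (1, 'Widget', 'Misc');
""")

# Generated sample data: (customer_id, number of orders, amount per order, quantity).
order_id = 0
for customer_id, n_orders, amount, qty in [(1, 6, 10.0, 2), (2, 7, 20.0, 1), (3, 2, 500.0, 1)]:
    for _ in range(n_orders):
        order_id += 1
        conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)",
                     (order_id, customer_id, "2024-01-01", amount))
        conn.execute("INSERT INTO order_details VALUES (?, ?, ?, ?, ?)",
                     (order_id, order_id, 1, qty, amount / qty))

COMBINED_QUERY = """
WITH metrics AS (
    -- 1. JOIN: spending per customer, summed over their orders.
    SELECT c.customer_id, 'spent' AS metric, SUM(o.total_amount) AS value
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    UNION ALL
    -- 2. UNION: number of products ordered per customer.
    SELECT o.customer_id, 'products' AS metric, SUM(od.quantity) AS value
    FROM orders AS o
    JOIN order_details AS od ON od.order_id = o.order_id
    GROUP BY o.customer_id
),
totals AS (
    SELECT customer_id,
           SUM(CASE WHEN metric = 'spent' THEN value ELSE 0 END) AS total_spent,
           SUM(CASE WHEN metric = 'products' THEN value ELSE 0 END) AS total_products
    FROM metrics
    GROUP BY customer_id
)
SELECT c.first_name || ' ' || c.last_name AS name, c.email,
       t.total_spent, t.total_products,
       -- 3. WINDOW FUNCTION: rank by spending.
       RANK() OVER (ORDER BY t.total_spent DESC) AS spending_rank
FROM totals AS t
JOIN customers AS c ON c.customer_id = t.customer_id
-- 4. SUBQUERY: keep only customers with more than 5 orders.
WHERE c.customer_id IN (SELECT customer_id FROM orders
                        GROUP BY customer_id HAVING COUNT(*) > 5)
ORDER BY spending_rank
"""

ranked = conn.execute(COMBINED_QUERY).fetchall()
```

Sam has the highest spending in the sample data but only two orders, so he is filtered out before ranking. If ranks should reflect all customers, the window function would need to move into a CTE that runs before the filter.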

## Get Top Ten Data from Last Column of CSV File Using Python
Data Engineer @ Google | Difficulty: Medium

How do you write a Python program to get the top ten data entries based on the last column from a comma-separated flat file (CSV)?
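
A minimal sketch using only the standard library. It assumes the last column holds numbers, that "top ten" means the ten largest values, and that the file has a header row; the function name and parameters are illustrative. heapq.nlargest keeps at most n rows in memory at a time, which helps when the file is large.

```python
import csv
import heapq
import io


def top_n_by_last_column(lines, n=10, has_header=True):
    """Return the n rows with the largest numeric value in the last column."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    rows = (row for row in reader if row)  # ignore blank lines
    return heapq.nlargest(n, rows, key=lambda row: float(row[-1]))


# Illustrative in-memory file: 15 rows with scores 0 through 14.
sample = io.StringIO(
    "name,score\n" + "".join(f"item{i},{i}\n" for i in range(15))
)
top_ten = top_n_by_last_column(sample)
```

For a real file, pass an open handle instead of the StringIO object, for example open(path, newline=""), where path is the location of the CSV file.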
