profile pic Coding
Upvote 0 Downvote
Diagnosing Latency Issues in a Java Application DevOps Engineer @ Yahoo Difficulty medium

If there is a long latency for a Java application running on a server, what could be the possible causes and how would you diagnose and resolve them?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Implement CDN Service DevOps Engineer @ Google Difficulty hard

You are tasked by a Cloud Provider to create a CDN Product similar to Cloudflare. Design the architecture for the CDN service, focusing on content caching, load balancing, and global content delivery.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Distributed File Storage System DevOps Engineer @ Google Difficulty hard

Design a distributed file storage system similar to Google Drive or Dropbox, capable of storing and retrieving large amounts of data across a distributed network. Consider the following aspects:

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Implementing a Caching Proxy Server DevOps Engineer @ Netflix Difficulty medium

How would you implement a caching proxy server? Describe the architecture, key components, and the steps involved in building it.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Solving the Stairway Problem with Dynamic Programming DevOps Engineer @ Meta Difficulty hard

What is an efficient dynamic programming approach to solve the classic stairway problem using Python that optimizes both time and space complexity?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Implementing Sort and Removing Duplicates from Three Lists DevOps Engineer @ Meta Difficulty medium

How can I efficiently sort and remove duplicates from three different lists in Python, ensuring a single, sorted list as output?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Solving Array Operations Problem DevOps Engineer @ Meta Difficulty medium

How can I efficiently perform a series of operations on an array to solve a given problem as part of a DevOps automation script, considering constraints like execution time and resources?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Merging CSV Files with Standard Input DevOps Engineer @ Meta Difficulty medium

How can I efficiently merge two sets of CSV files using standard input, where the first line in each file identifies the dataset, ensuring a seamless data integration process?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Integrating Java Collections with SQL Databases for Backend Systems DevOps Engineer @ Meta Difficulty medium

How can Java collections be effectively integrated with SQL databases to enhance backend systems' performance and scalability?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Finding the 95th Percentile of URL Sizes Data Engineer @ Yahoo Difficulty hard

Given a dataset of 2 billion URLs and their sizes, describe an efficient algorithm to find the 95th percentile of all the sizes. Provide a brief explanation and a sample implementation using a distributed computing framework.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
SQL Queries and Basic Python Questions Data Engineer @ Uber Difficulty medium
  1. SQL Query for Top N Records: Write an SQL query to fetch the top 5 employees with the highest salaries from an employees table with columns employee_id, name, and salary.

  2. SQL Query for Aggregate Function: Write an SQL query to find the average salary of employees in each department from a table employees with columns department_id, employee_id, and salary.

  3. Python Function for Factorial: Write a Python function to compute the factorial of a given number.

  4. Python List Comprehension: Write a Python one-liner using list comprehension to create a list of squares of the first 10 natural numbers.

  5. Python String Manipulation: Write a Python function to reverse a given string.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Function to Calculate Square Root with Precision and Caching Software Engineer @ Uber Difficulty medium

Implement a function in Python to calculate the square root of any given number up to 2 decimal points precision. Then, optimize the function and implement a caching mechanism to eliminate redundant calculations. Provide an explanation and the solution.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Get Top Ten Data from Last Column of CSV File Using Python Data Engineer @ Google Difficulty medium

How do you write a Python program to get the top ten data entries based on the last column from a comma-separated flat file (CSV)?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Calculating Median from a Stream of Data Data Engineer @ Yahoo Difficulty hard

Describe an efficient algorithm to calculate the median from a stream of data. How would you implement this in a real-time data processing system?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Check if One String is a Rotation of Another Software Engineer @ Uber Difficulty medium

Given two strings s1 and s2, write a Python function to check if s1 is a rotation of s2. Provide an explanation and the solution.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Combining Data Sets and Posting to an API Software Engineer @ Uber Difficulty medium

Given two data sets, build a Python application to combine the data and post the combined data into an API. Describe the approach and provide a sample implementation.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
MapReduce, IP Pool Management, and Rotated Array Search Data Engineer @ Uber Difficulty hard
  1. MapReduce Random Lines from Large Log Files: Using the MapReduce framework, how would you get X amount of random lines from large log files?

  2. Managing a Pool of IPs: Describe how you would design a system to manage a pool of IP addresses, ensuring efficient allocation and deallocation.

  3. Rotated Array Search: Given a rotated sorted array, write a Python function to search for a target value. Explain your approach and provide the solution.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Designing a Web Scraping Utility for Data Extraction and Storage DevOps Engineer @ Microsoft Difficulty medium

How can a DevOps Engineer design a web scraping utility using Python and Beautiful Soup for data extraction from a chosen website and store the results in MongoDB?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Understanding Linked List Algorithms DevOps Engineer @ Microsoft Difficulty medium

Write a function to reverse a singly linked list. You should be able to transform the list from a sequence such as 1→2→3→41→2→3→4 to 4→3→2→14→3→2→1 using your function.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Handling Under-performing Predictive Models Data Analyst @ Apple Difficulty medium

How would you identify and address potentially under-performing predictive models in a production environment?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Data Analysis on Provided Datasets Data Analyst @ Uber Difficulty hard

Write a SQL query to find the top 5 customers based on total expenditure over the last year.

Additionally, create a Python script to plot a bar chart showing total monthly orders over the same period.

Sample Input Data

orders Table:

| order_id | customer_id | order_date | total_amount |
|----------|-------------|------------|--------------|
| 1        | 1           | 2022-11-20 | 150.75       |
| 2        | 2           | 2022-12-05 | 200.50       |
| 3        | 1           | 2022-12-20 | 75.00        |
| 4        | 3           | 2023-01-05 | 300.00       |
| 5        | 2           | 2023-01-15 | 180.75       |
| 6        | 1           | 2023-02-10 | 120.00       |
| 7        | 4           | 2023-02-20 | 250.00       |
| 8        | 3           | 2023-03-01 | 90.00        |

customers Table:

| customer_id | customer_name | signup_date |
|-------------|---------------|-------------|
| 1           | Alice         | 2021-05-01  |
| 2           | Bob           | 2022-03-15  |
| 3           | Carol         | 2022-07-22  |
| 4           | Dave          | 2022-09-19  |
Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Python Scripting Data Analyst @ Amazon Difficulty medium

Write a basic Python function to print a diamond of asterisks given an integer n representing the number of rows in the upper half of the diamond (excluding the middle row).

Input An integer n.

Example For n = 3, the output should be:

  *
 ***
*****
 ***
  *
Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Customer Churn Prediction Using SQL and Python Data Analyst @ Uber Difficulty medium

Here is sample data for the transactions table:

| customer_id | month      | amount_spent |
|-------------|------------|--------------|
| 1           | 2022-10-01 | 100.50       |
| 1           | 2022-11-01 | 150.75       |
| 1           | 2022-12-01 | 120.00       |
| 2           | 2022-10-01 | 200.00       |
| 2           | 2022-11-01 | 180.25       |
| 2           | 2022-12-01 | 210.50       |
| 3           | 2022-10-01 | 50.25        |
| 3           | 2022-11-01 | 75.00        |
| 3           | 2022-12-01 | 90.00        |

Additionally, assume the dataset also provides churn information with the following columns: churn_info table:

| customer_id | has_churned |
|-------------|-------------|
| 1           | 0           |
| 2           | 1           |
| 3           | 0           |

Write a SQL query to:

  1. Calculate the average monthly spend per customer for the past year.

Write a Python script that uses the dataset to: 2. Calculate the probability of customer churn using logistic regression. Assume you have columns for customer_id, month, amount_spent, and a target column has_churned.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Sales Performance Analysis Data Analyst @ Uber Difficulty medium

Write a SQL query to find the top 3 products based on total sales value.

Additionally, create a Python script to plot a line graph showing daily total sales over the past month.

Lastly, explain how you would use Excel to calculate and visualize the sales trend.

Input:

transactions Table:

| transaction_id | date       | product_id | quantity | price  |
|----------------|------------|------------|----------|--------|
| 1              | 2023-09-01 | 101        | 2        | 10.00  |
| 2              | 2023-09-01 | 102        | 1        | 20.00  |
| 3              | 2023-09-02 | 101        | 1        | 10.00  |
| 4              | 2023-09-02 | 103        | 3        | 15.00  |
| 5              | 2023-09-03 | 102        | 2        | 20.00  |
| 6              | 2023-09-03 | 101        | 2        | 10.00  |
Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Using Python to Calculate Cart Abandonment Rates Data Analyst @ Uber Difficulty medium

Given a dataset that contains the following columns: session_id, user_id, event_type, and timestamp. The event_type column can have values such as view, add_to_cart, and purchase. Write a Python script to calculate the cart abandonment rate, defined as the percentage of sessions where items were added to the cart, but a purchase was not completed.

Input: session_data.csv looks like this:

| session_id | user_id | event_type   | timestamp           |
|------------|---------|--------------|---------------------|
| 1          | 101     | view         | 2023-09-01 10:00:00 |
| 1          | 101     | add_to_cart  | 2023-09-01 10:05:00 |
| 2          | 101     | view         | 2023-09-01 11:00:00 |
| 2          | 101     | purchase     | 2023-09-01 11:15:00 |
| 2          | 101     | add_to_cart  | 2023-09-01 11:10:00 |
| 3          | 102     | view         | 2023-09-02 09:00:00 |
| 3          | 102     | add_to_cart  | 2023-09-02 09:10:00 |
| 4          | 103     | view         | 2023-09-02 10:00:00 |
| 4          | 103     | purchase     | 2023-09-02 10:15:00 |
Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Creating a Data Quality Dashboard Data Analyst @ Accenture Difficulty medium

Given a dataset containing customer transaction records in a CSV format, describe how you would create a data quality dashboard to track key metrics like missing values, duplicate entries, and data type mismatches. Specify the tools and technologies you would use.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Forecasting Pricing for TV Model Based on Sales Data Data Analyst @ Amazon Difficulty hard

Written question: How would you forecast pricing for this particular model of TV based on this sales data from last year?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Understanding Linear Regression and P-Value Testing Data Analyst @ Microsoft Difficulty medium

Explain how you would use linear regression to predict a continuous variable and how you would interpret the p-values of the regression coefficients.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Factors and Process for Generating a Forecast for an Item @ Apple Difficulty medium

What factors should be used in generating a forecast for an item, and what should be the process of generation?

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Merging, Sorting, and Ranking Tables in SQL and Python List Comprehensions Business Analyst @ Uber Difficulty medium

Write an SQL query to merge these tables, sort the result by department_name and employee_name, and rank employees within their departments.

Additionally, write a small Python program using list comprehensions to filter and sort a list of dictionaries representing employees.

employees Table:

| employee_id | employee_name | department_id |
|-------------|---------------|---------------|
| 1           | Alice         | 10            |
| 2           | Bob           | 10            |
| 3           | Carol         | 20            |
| 4           | Dave          | 20            |
| 5           | Eve           | 30            |

departments Table:

| department_id | department_name |
|---------------|-----------------|
| 10            | Sales           |
| 20            | Marketing       |
| 30            | Finance         |
Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Solving the 2-Egg Problem Business Analyst @ Uber Difficulty hard

You are given 2 eggs and a building with n floors. The goal is to determine the highest floor from which an egg can be dropped without breaking. Describe a strategy to minimize the number of drops required in the worst-case scenario and write a Python program to implement this strategy for a building with a given number of floors n.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Handling Skewed Distributions and Imbalanced Classes Data Analyst @ Microsoft Difficulty medium

How would you handle skewed distributions and imbalanced classes in a dataset intended for machine learning? Provide specific techniques and tools you would use.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Calculating Mean, Median, and Standard Deviation Business Analyst @ Uber Difficulty easy

You are given a dataset sales_data with a column sales_amount representing the sales figures for different transactions. Write Python code to calculate the mean, median, and standard deviation of the sales amounts.

Solution:

Please sign-in to view the solution

Upvote 0 Downvote
Calculating Weighted Average of Requests per Driver Based on a CSV Dataset Business Analyst @ Uber Difficulty medium

You have a CSV file driver_requests.csv with columns driver_id, num_requests, and hours_driven. Write a Python program to calculate the weighted average of requests per driver, weighted by the number of hours driven.

Solution:

Please sign-in to view the solution