If there is a long latency for a Java application running on a server, what could be the possible causes and how would you diagnose and resolve them?
Please sign-in to view the solution
You are tasked by a Cloud Provider to create a CDN Product similar to Cloudflare. Design the architecture for the CDN service, focusing on content caching, load balancing, and global content delivery.
Please sign-in to view the solution
Design a distributed file storage system similar to Google Drive or Dropbox, capable of storing and retrieving large amounts of data across a distributed network. Consider the following aspects:
Please sign-in to view the solution
How would you implement a caching proxy server? Describe the architecture, key components, and the steps involved in building it.
Please sign-in to view the solution
How can I efficiently perform a series of operations on an array to solve a given problem as part of a DevOps automation script, considering constraints like execution time and resources?
Please sign-in to view the solution
How can I efficiently merge two sets of CSV files using standard input, where the first line in each file identifies the dataset, ensuring a seamless data integration process?
Please sign-in to view the solution
How can I efficiently sort and remove duplicates from three different lists in Python, ensuring a single, sorted list as output?
Please sign-in to view the solution
What is an efficient dynamic programming approach to solve the classic stairway problem using Python that optimizes both time and space complexity?
Please sign-in to view the solution
How can Java collections be effectively integrated with SQL databases to enhance backend systems' performance and scalability?
Please sign-in to view the solution
Describe an efficient algorithm to calculate the median from a stream of data. How would you implement this in a real-time data processing system?
Please sign-in to view the solution
Given a dataset of 2 billion URLs and their sizes, describe an efficient algorithm to find the 95th percentile of all the sizes. Provide a brief explanation and a sample implementation using a distributed computing framework.
Please sign-in to view the solution
SQL Query for Top N Records: Write an SQL query to fetch the top 5 employees with the highest salaries from an
employees
table with columnsemployee_id
,name
, andsalary
.SQL Query for Aggregate Function: Write an SQL query to find the average salary of employees in each department from a table
employees
with columnsdepartment_id
,employee_id
, andsalary
.Python Function for Factorial: Write a Python function to compute the factorial of a given number.
Python List Comprehension: Write a Python one-liner using list comprehension to create a list of squares of the first 10 natural numbers.
Python String Manipulation: Write a Python function to reverse a given string.
Please sign-in to view the solution
Given two strings s1
and s2
, write a Python function to check if s1
is a rotation of s2
. Provide an explanation and the solution.
Please sign-in to view the solution
Given two data sets, build a Python application to combine the data and post the combined data into an API. Describe the approach and provide a sample implementation.
Please sign-in to view the solution
MapReduce Random Lines from Large Log Files: Using the MapReduce framework, how would you get X amount of random lines from large log files?
Managing a Pool of IPs: Describe how you would design a system to manage a pool of IP addresses, ensuring efficient allocation and deallocation.
Rotated Array Search: Given a rotated sorted array, write a Python function to search for a target value. Explain your approach and provide the solution.
Please sign-in to view the solution
Implement a function in Python to calculate the square root of any given number up to 2 decimal points precision. Then, optimize the function and implement a caching mechanism to eliminate redundant calculations. Provide an explanation and the solution.
Please sign-in to view the solution
How do you write a Python program to get the top ten data entries based on the last column from a comma-separated flat file (CSV)?
Please sign-in to view the solution
How can a DevOps Engineer design a web scraping utility using Python and Beautiful Soup for data extraction from a chosen website and store the results in MongoDB?
Please sign-in to view the solution
Write a function to reverse a singly linked list. You should be able to transform the list from a sequence such as 1→2→3→41→2→3→4 to 4→3→2→14→3→2→1 using your function.
Please sign-in to view the solution
Given a dataset containing customer transaction records in a CSV format, describe how you would create a data quality dashboard to track key metrics like missing values, duplicate entries, and data type mismatches. Specify the tools and technologies you would use.
Please sign-in to view the solution
Write a basic Python function to print a diamond of asterisks given an integer n
representing the number of rows in the upper half of the diamond (excluding the middle row).
Input
An integer n
.
Example
For n = 3
, the output should be:
*
***
*****
***
*
Please sign-in to view the solution
Written question: How would you forecast pricing for this particular model of TV based on this sales data from last year?
Please sign-in to view the solution
Explain how you would use linear regression to predict a continuous variable and how you would interpret the p-values of the regression coefficients.
Please sign-in to view the solution
What factors should be used in generating a forecast for an item, and what should be the process of generation?
Please sign-in to view the solution
How would you identify and address potentially under-performing predictive models in a production environment?
Please sign-in to view the solution
How would you handle skewed distributions and imbalanced classes in a dataset intended for machine learning? Provide specific techniques and tools you would use.
Please sign-in to view the solution
Here is sample data for the transactions
table:
| customer_id | month | amount_spent |
|-------------|------------|--------------|
| 1 | 2022-10-01 | 100.50 |
| 1 | 2022-11-01 | 150.75 |
| 1 | 2022-12-01 | 120.00 |
| 2 | 2022-10-01 | 200.00 |
| 2 | 2022-11-01 | 180.25 |
| 2 | 2022-12-01 | 210.50 |
| 3 | 2022-10-01 | 50.25 |
| 3 | 2022-11-01 | 75.00 |
| 3 | 2022-12-01 | 90.00 |
Additionally, assume the dataset also provides churn information with the following columns:
churn_info
table:
| customer_id | has_churned |
|-------------|-------------|
| 1 | 0 |
| 2 | 1 |
| 3 | 0 |
Write a SQL query to:
- Calculate the average monthly spend per customer for the past year.
Write a Python script that uses the dataset to:
2. Calculate the probability of customer churn using logistic regression. Assume you have columns for customer_id
, month
, amount_spent
, and a target column has_churned
.
Please sign-in to view the solution
Write a SQL query to find the top 5 customers based on total expenditure over the last year.
Additionally, create a Python script to plot a bar chart showing total monthly orders over the same period.
Sample Input Data
orders
Table:
| order_id | customer_id | order_date | total_amount |
|----------|-------------|------------|--------------|
| 1 | 1 | 2022-11-20 | 150.75 |
| 2 | 2 | 2022-12-05 | 200.50 |
| 3 | 1 | 2022-12-20 | 75.00 |
| 4 | 3 | 2023-01-05 | 300.00 |
| 5 | 2 | 2023-01-15 | 180.75 |
| 6 | 1 | 2023-02-10 | 120.00 |
| 7 | 4 | 2023-02-20 | 250.00 |
| 8 | 3 | 2023-03-01 | 90.00 |
customers
Table:
| customer_id | customer_name | signup_date |
|-------------|---------------|-------------|
| 1 | Alice | 2021-05-01 |
| 2 | Bob | 2022-03-15 |
| 3 | Carol | 2022-07-22 |
| 4 | Dave | 2022-09-19 |
Please sign-in to view the solution
Write a SQL query to find the top 3 products based on total sales value.
Additionally, create a Python script to plot a line graph showing daily total sales over the past month.
Lastly, explain how you would use Excel to calculate and visualize the sales trend.
Input:
transactions
Table:
| transaction_id | date | product_id | quantity | price |
|----------------|------------|------------|----------|--------|
| 1 | 2023-09-01 | 101 | 2 | 10.00 |
| 2 | 2023-09-01 | 102 | 1 | 20.00 |
| 3 | 2023-09-02 | 101 | 1 | 10.00 |
| 4 | 2023-09-02 | 103 | 3 | 15.00 |
| 5 | 2023-09-03 | 102 | 2 | 20.00 |
| 6 | 2023-09-03 | 101 | 2 | 10.00 |
Please sign-in to view the solution
Given a dataset that contains the following columns: session_id
, user_id
, event_type
, and timestamp
. The event_type
column can have values such as view
, add_to_cart
, and purchase
. Write a Python script to calculate the cart abandonment rate, defined as the percentage of sessions where items were added to the cart, but a purchase was not completed.
Input:
session_data.csv
looks like this:
| session_id | user_id | event_type | timestamp |
|------------|---------|--------------|---------------------|
| 1 | 101 | view | 2023-09-01 10:00:00 |
| 1 | 101 | add_to_cart | 2023-09-01 10:05:00 |
| 2 | 101 | view | 2023-09-01 11:00:00 |
| 2 | 101 | purchase | 2023-09-01 11:15:00 |
| 2 | 101 | add_to_cart | 2023-09-01 11:10:00 |
| 3 | 102 | view | 2023-09-02 09:00:00 |
| 3 | 102 | add_to_cart | 2023-09-02 09:10:00 |
| 4 | 103 | view | 2023-09-02 10:00:00 |
| 4 | 103 | purchase | 2023-09-02 10:15:00 |
Please sign-in to view the solution
You are given a dataset sales_data
with a column sales_amount
representing the sales figures for different transactions. Write Python code to calculate the mean, median, and standard deviation of the sales amounts.
Please sign-in to view the solution
Write an SQL query to merge these tables, sort the result by department_name
and employee_name
, and rank employees within their departments.
Additionally, write a small Python program using list comprehensions to filter and sort a list of dictionaries representing employees.
employees
Table:
| employee_id | employee_name | department_id |
|-------------|---------------|---------------|
| 1 | Alice | 10 |
| 2 | Bob | 10 |
| 3 | Carol | 20 |
| 4 | Dave | 20 |
| 5 | Eve | 30 |
departments
Table:
| department_id | department_name |
|---------------|-----------------|
| 10 | Sales |
| 20 | Marketing |
| 30 | Finance |
Please sign-in to view the solution
You are given 2 eggs and a building with n floors. The goal is to determine the highest floor from which an egg can be dropped without breaking. Describe a strategy to minimize the number of drops required in the worst-case scenario and write a Python program to implement this strategy for a building with a given number of floors n.
Please sign-in to view the solution
You have a CSV file driver_requests.csv
with columns driver_id
, num_requests
, and hours_driven
. Write a Python program to calculate the weighted average of requests per driver, weighted by the number of hours driven.
Please sign-in to view the solution