If there is a long latency for a Java application running on a server, what could be the possible causes and how would you diagnose and resolve them?

Please sign-in to view the solution

You are tasked by a Cloud Provider to create a CDN Product similar to Cloudflare. Design the architecture for the CDN service, focusing on content caching, load balancing, and global content delivery.

Please sign-in to view the solution

Design a distributed file storage system similar to Google Drive or Dropbox, capable of storing and retrieving large amounts of data across a distributed network. Consider the following aspects:

Please sign-in to view the solution

How would you implement a caching proxy server? Describe the architecture, key components, and the steps involved in building it.

Please sign-in to view the solution

What is an efficient dynamic programming approach to solve the classic stairway problem using Python that optimizes both time and space complexity?

Please sign-in to view the solution

How can I efficiently sort and remove duplicates from three different lists in Python, ensuring a single, sorted list as output?

Please sign-in to view the solution

How can I efficiently perform a series of operations on an array to solve a given problem as part of a DevOps automation script, considering constraints like execution time and resources?

Please sign-in to view the solution

How can I efficiently merge two sets of CSV files using standard input, where the first line in each file identifies the dataset, ensuring a seamless data integration process?

Please sign-in to view the solution

How can Java collections be effectively integrated with SQL databases to enhance backend systems' performance and scalability?

Please sign-in to view the solution

Given a dataset of 2 billion URLs and their sizes, describe an efficient algorithm to find the **95th percentile** of all the sizes. Provide a brief explanation and a sample implementation using a distributed computing framework.

Please sign-in to view the solution

**SQL Query for Top N Records:**Write an SQL query to fetch the top 5 employees with the highest salaries from an`employees`

table with columns`employee_id`

,`name`

, and`salary`

.**SQL Query for Aggregate Function:**Write an SQL query to find the average salary of employees in each department from a table`employees`

with columns`department_id`

,`employee_id`

, and`salary`

.**Python Function for Factorial:**Write a Python function to compute the factorial of a given number.**Python List Comprehension:**Write a Python one-liner using list comprehension to create a list of squares of the first 10 natural numbers.**Python String Manipulation:**Write a Python function to reverse a given string.

Please sign-in to view the solution

Implement a function in Python to calculate the square root of any given number up to 2 decimal points precision. Then, optimize the function and implement a caching mechanism to eliminate redundant calculations. Provide an explanation and the solution.

Please sign-in to view the solution

How do you write a Python program to get the top ten data entries based on the last column from a comma-separated flat file (CSV)?

Please sign-in to view the solution

Describe an efficient algorithm to calculate the **median** from a stream of data. How would you implement this in a real-time data processing system?

Please sign-in to view the solution

Given two strings `s1`

and `s2`

, write a Python function to check if `s1`

is a rotation of `s2`

. Provide an explanation and the solution.

Please sign-in to view the solution

Given two data sets, build a Python application to combine the data and post the combined data into an API. Describe the approach and provide a sample implementation.

Please sign-in to view the solution

**MapReduce Random Lines from Large Log Files:**Using the MapReduce framework, how would you get X amount of random lines from large log files?**Managing a Pool of IPs:**Describe how you would design a system to manage a pool of IP addresses, ensuring efficient allocation and deallocation.**Rotated Array Search:**Given a rotated sorted array, write a Python function to search for a target value. Explain your approach and provide the solution.

Please sign-in to view the solution

How can a DevOps Engineer design a web scraping utility using Python and Beautiful Soup for data extraction from a chosen website and store the results in MongoDB?

Please sign-in to view the solution

Write a function to reverse a singly linked list. You should be able to transform the list from a sequence such as 1→2→3→41→2→3→4 to 4→3→2→14→3→2→1 using your function.

Please sign-in to view the solution

How would you identify and address potentially under-performing predictive models in a production environment?

Please sign-in to view the solution

Write a SQL query to find the top 5 customers based on total expenditure over the last year.

Additionally, create a Python script to plot a bar chart showing total monthly orders over the same period.

### Sample Input Data

`orders`

Table:

```
| order_id | customer_id | order_date | total_amount |
|----------|-------------|------------|--------------|
| 1 | 1 | 2022-11-20 | 150.75 |
| 2 | 2 | 2022-12-05 | 200.50 |
| 3 | 1 | 2022-12-20 | 75.00 |
| 4 | 3 | 2023-01-05 | 300.00 |
| 5 | 2 | 2023-01-15 | 180.75 |
| 6 | 1 | 2023-02-10 | 120.00 |
| 7 | 4 | 2023-02-20 | 250.00 |
| 8 | 3 | 2023-03-01 | 90.00 |
```

`customers`

Table:

```
| customer_id | customer_name | signup_date |
|-------------|---------------|-------------|
| 1 | Alice | 2021-05-01 |
| 2 | Bob | 2022-03-15 |
| 3 | Carol | 2022-07-22 |
| 4 | Dave | 2022-09-19 |
```

Please sign-in to view the solution

Write a basic Python function to print a diamond of asterisks given an integer `n`

representing the number of rows in the upper half of the diamond (excluding the middle row).

**Input**
An integer `n`

.

**Example**
For `n = 3`

, the output should be:

```
*
***
*****
***
*
```

Please sign-in to view the solution

Here is sample data for the `transactions`

table:

```
| customer_id | month | amount_spent |
|-------------|------------|--------------|
| 1 | 2022-10-01 | 100.50 |
| 1 | 2022-11-01 | 150.75 |
| 1 | 2022-12-01 | 120.00 |
| 2 | 2022-10-01 | 200.00 |
| 2 | 2022-11-01 | 180.25 |
| 2 | 2022-12-01 | 210.50 |
| 3 | 2022-10-01 | 50.25 |
| 3 | 2022-11-01 | 75.00 |
| 3 | 2022-12-01 | 90.00 |
```

Additionally, assume the dataset also provides churn information with the following columns:
`churn_info`

table:

```
| customer_id | has_churned |
|-------------|-------------|
| 1 | 0 |
| 2 | 1 |
| 3 | 0 |
```

Write a SQL query to:

- Calculate the average monthly spend per customer for the past year.

Write a Python script that uses the dataset to:
2. Calculate the probability of customer churn using logistic regression. Assume you have columns for `customer_id`

, `month`

, `amount_spent`

, and a target column `has_churned`

.

Please sign-in to view the solution

Write a SQL query to find the top 3 products based on total sales value.

Additionally, create a Python script to plot a line graph showing daily total sales over the past month.

Lastly, explain how you would use Excel to calculate and visualize the sales trend.

**Input:**

`transactions`

Table:

```
| transaction_id | date | product_id | quantity | price |
|----------------|------------|------------|----------|--------|
| 1 | 2023-09-01 | 101 | 2 | 10.00 |
| 2 | 2023-09-01 | 102 | 1 | 20.00 |
| 3 | 2023-09-02 | 101 | 1 | 10.00 |
| 4 | 2023-09-02 | 103 | 3 | 15.00 |
| 5 | 2023-09-03 | 102 | 2 | 20.00 |
| 6 | 2023-09-03 | 101 | 2 | 10.00 |
```

Please sign-in to view the solution

Given a dataset that contains the following columns: `session_id`

, `user_id`

, `event_type`

, and `timestamp`

. The `event_type`

column can have values such as `view`

, `add_to_cart`

, and `purchase`

. Write a Python script to calculate the cart abandonment rate, defined as the percentage of sessions where items were added to the cart, but a purchase was not completed.

**Input:**
`session_data.csv`

looks like this:

```
| session_id | user_id | event_type | timestamp |
|------------|---------|--------------|---------------------|
| 1 | 101 | view | 2023-09-01 10:00:00 |
| 1 | 101 | add_to_cart | 2023-09-01 10:05:00 |
| 2 | 101 | view | 2023-09-01 11:00:00 |
| 2 | 101 | purchase | 2023-09-01 11:15:00 |
| 2 | 101 | add_to_cart | 2023-09-01 11:10:00 |
| 3 | 102 | view | 2023-09-02 09:00:00 |
| 3 | 102 | add_to_cart | 2023-09-02 09:10:00 |
| 4 | 103 | view | 2023-09-02 10:00:00 |
| 4 | 103 | purchase | 2023-09-02 10:15:00 |
```

Please sign-in to view the solution

Given a dataset containing customer transaction records in a CSV format, describe how you would create a data quality dashboard to track key metrics like missing values, duplicate entries, and data type mismatches. Specify the tools and technologies you would use.

Please sign-in to view the solution

Written question: How would you forecast pricing for this particular model of TV based on this sales data from last year?

Please sign-in to view the solution

Explain how you would use linear regression to predict a continuous variable and how you would interpret the p-values of the regression coefficients.

Please sign-in to view the solution

What factors should be used in generating a forecast for an item, and what should be the process of generation?

Please sign-in to view the solution

Write an SQL query to merge these tables, sort the result by `department_name`

and `employee_name`

, and rank employees within their departments.

Additionally, write a small Python program using list comprehensions to filter and sort a list of dictionaries representing employees.

`employees`

Table:

```
| employee_id | employee_name | department_id |
|-------------|---------------|---------------|
| 1 | Alice | 10 |
| 2 | Bob | 10 |
| 3 | Carol | 20 |
| 4 | Dave | 20 |
| 5 | Eve | 30 |
```

`departments`

Table:

```
| department_id | department_name |
|---------------|-----------------|
| 10 | Sales |
| 20 | Marketing |
| 30 | Finance |
```

Please sign-in to view the solution

You are given 2 eggs and a building with n floors. The goal is to determine the highest floor from which an egg can be dropped without breaking. Describe a strategy to minimize the number of drops required in the worst-case scenario and write a Python program to implement this strategy for a building with a given number of floors n.

Please sign-in to view the solution

How would you handle skewed distributions and imbalanced classes in a dataset intended for machine learning? Provide specific techniques and tools you would use.

Please sign-in to view the solution

You are given a dataset `sales_data`

with a column `sales_amount`

representing the sales figures for different transactions. Write Python code to calculate the mean, median, and standard deviation of the sales amounts.

Please sign-in to view the solution

You have a CSV file `driver_requests.csv`

with columns `driver_id`

, `num_requests`

, and `hours_driven`

. Write a Python program to calculate the weighted average of requests per driver, weighted by the number of hours driven.

Please sign-in to view the solution