Filtering VC-Funded Startups

Objective

We have two DataFrames: one describing venture capitalists (VCs) and one describing the startups they have funded. The aim is to find the VCs whose funded startups received an average funding amount above a limit that is specific to each VC.

Task

Write a PySpark function that combines these DataFrames and returns the venture capitalists whose funded startups have an average funding amount strictly greater than their corresponding funding_limit.

The avg_funding field should contain the average funding that the venture capitalist provided to its startups, cast to float. Save the resulting DataFrame as result_df. Ensure the output contains exactly the 3 columns listed in the Expected Output Schema, in that order, and sort the final output by vc_id in ascending order.
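
One possible shape for such a function is sketched below. It is a minimal sketch rather than the reference solution: the function name find_vcs_above_limit and its parameter names are hypothetical (the starter script may prescribe different ones), and it assumes both inputs are already loaded as DataFrames with the columns listed in the Schema section.

```python
def find_vcs_above_limit(venture_capitalist_df, funded_startups_df):
    """Return VCs whose average startup funding strictly exceeds funding_limit.

    Hypothetical helper: the name and signature are assumptions, not part of
    the task statement.
    """
    # Imported here so the sketch stays self-contained when pasted into a script.
    from pyspark.sql import functions as F

    # Step 1: average funding per VC across all startups it funded.
    avg_df = funded_startups_df.groupBy("vc_id").agg(
        F.avg("funding").alias("avg_funding")
    )

    # Step 2: attach each VC's name and limit. An inner join drops VCs with no
    # funded startups, which have no average to compare anyway.
    joined_df = venture_capitalist_df.join(avg_df, on="vc_id", how="inner")

    # Step 3: keep only strict exceedances, then shape the output:
    # three columns in schema order, avg_funding cast to float, sorted by vc_id.
    return (
        joined_df.filter(F.col("avg_funding") > F.col("funding_limit"))
        .select(
            "vc_id",
            "vc_name",
            F.col("avg_funding").cast("float").alias("avg_funding"),
        )
        .orderBy(F.col("vc_id").asc())
    )
```

With the two inputs loaded, the call would look like result_df = find_vcs_above_limit(venture_capitalist_df, funded_startups_df). The comparison is done before the cast so that the filter works on the full-precision average. If the CSV files have header rows (an assumption, since the task does not say), they could be read with spark.read.csv(path, header=True, inferSchema=True).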

File Paths

  • VC Dataset: /home/interview/venture_capitalist.csv
  • Startups Dataset: /home/interview/funded_startups.csv
  • Starter script: /home/interview/vc_funding.py

Schema

venture_capitalist.csv

Column Name    Type
vc_id          string
vc_name        string
funding_limit  float

funded_startups.csv

Column Name   Type
startup_id    string
startup_name  string
vc_id         string
funding       float

Expected Output Schema

Column Name  Type
vc_id        string
vc_name      string
avg_funding  float

Example

Given this sample input:

venture_capitalist_df

vc_id  vc_name    funding_limit
VC1    VC Firm 1  1.5
VC2    VC Firm 2  2.0
VC3    VC Firm 3  1.75
VC4    VC Firm 4  2.5

funded_startups_df

startup_id  startup_name  vc_id  funding
S1          Startup 1     VC1    2.0
S2          Startup 2     VC1    1.0
S3          Startup 3     VC2    2.5
S4          Startup 4     VC2    2.0
S5          Startup 5     VC3    1.8
S6          Startup 6     VC3    1.7
S7          Startup 7     VC4    3.0
S8          Startup 8     VC4    2.0

The expected output would be:

vc_id  vc_name    avg_funding
VC2    VC Firm 2  2.25

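Explanation: VC2 funded S3 (2.5) and S4 (2.0), for an average of 2.25. Since 2.25 > 2.0 (VC2's limit), VC2 is included. VC1's average is 1.5, which equals its limit of 1.5 and is therefore not strictly greater, so VC1 is excluded. The same holds for VC3 (average 1.75, limit 1.75) and VC4 (average 2.5, limit 2.5).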
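
The arithmetic behind this example can be checked without Spark. The plain-Python sketch below uses the sample rows from the tables above; the helper name vcs_above_limit is hypothetical and exists only for illustration. It groups funding by vc_id, averages each group, and keeps a VC only when the average is strictly greater than its limit.

```python
from collections import defaultdict

# Sample rows copied from the example tables.
vcs = [
    ("VC1", "VC Firm 1", 1.5),
    ("VC2", "VC Firm 2", 2.0),
    ("VC3", "VC Firm 3", 1.75),
    ("VC4", "VC Firm 4", 2.5),
]
startups = [
    ("S1", "Startup 1", "VC1", 2.0),
    ("S2", "Startup 2", "VC1", 1.0),
    ("S3", "Startup 3", "VC2", 2.5),
    ("S4", "Startup 4", "VC2", 2.0),
    ("S5", "Startup 5", "VC3", 1.8),
    ("S6", "Startup 6", "VC3", 1.7),
    ("S7", "Startup 7", "VC4", 3.0),
    ("S8", "Startup 8", "VC4", 2.0),
]


def vcs_above_limit(vc_rows, startup_rows):
    """Hypothetical helper mirroring the group, join, and filter steps."""
    # Collect the funding amounts for each vc_id.
    funding_by_vc = defaultdict(list)
    for _startup_id, _startup_name, vc_id, funding in startup_rows:
        funding_by_vc[vc_id].append(funding)

    kept = []
    for vc_id, vc_name, limit in vc_rows:
        amounts = funding_by_vc.get(vc_id)
        if not amounts:
            continue  # no funded startups, so no average to compare
        avg_funding = sum(amounts) / len(amounts)
        if avg_funding > limit:  # strict comparison: ties are excluded
            kept.append((vc_id, vc_name, avg_funding))

    # Tuples sort by their first element, i.e. vc_id ascending.
    return sorted(kept)


result = vcs_above_limit(vcs, startups)
print(result)
```

Only VC2 passes the strict comparison here; the other three VCs average exactly their limits, so replacing > with >= would return all four.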
