sql

Solving a LinkedIn Data Science SQL Interview Question

Bram Adams

Jan 27, 2023 — 1 min read

 https://www.youtube.com/watch?v=d2920ysu2hQ&ab_channel=JayFeng

 01:37

peepoThink (1)

we could use GROUP BY to see how many posting share the same uid
we could then SELECT WHERE count == 1
(2)
use DATE_TRUNC to group by year or
use WHERE to split each into 180 day intervals from now()

 02:13

visualize the output and work backwards, make an output table, split into intermediate tables to process

 05:55

WITH user_job AS (
	SELECT u_id, j_id, COUNT(DISTINCT date_posted) AS num_posted
	FROM jobs
	GROUP BY u_id, j_id
)

 08:06

SELECT
	SUM(CASE WHEN avg_posted > 1 THEN 1 END) AS posted_multiple
	, SUM(CASE WHEN avg_posted = 1 THEN 1 END) AS posted_once
FROM (
	SELECT
		user_id, AVG(num_posted) AS avg_posted
	FROM user_job
	GROUP BY user_id
) AS t

 10:41

HAVING won't work because it will filter values

Post Mortem

WITH - named function
COUNT(DISTINCT field_name) - get count of distinct values in field
GROUP_BY creates sub tables on different keys
CASE WHEN - if
END - end the if
ELSE - else
AVG() - get average of field

Use intermediate tables to get to the final table

Issue 58: The Words in Your Journals in the Ears of Others

The sounds of the words you speak may lie on your own lips, but if you do not relinquish them entirely to a listener they never become words, and you say nothing at all. The words die with the sound.