top of page

SQL Optimization and Data Analysis for Music Database

Bess Yang (qy561@nyu.edu),  Iris Lu (hl5679@nyu.edu), Chloe Kwon (ekk294@nyu.edu)

Project Overview

This project involved analyzing and optimizing SQL queries on a large music database. We explored the data to answer various questions, such as identifying tracks with specific characteristics, most listened-to tracks, and artists who collaborated with the most album producers. In the second part, we focused on optimizing query execution time using indexing strategies to enhance performance.

Languages, Platforms, and Tools

  • Languages: Python, SQL

  • Tools: SQLite, sqlitebrowser (for visualizing and interacting with the database)

  • Platforms: Local machine, GitHub (for version control)

My Contributions

  • In this project, we did not split or distribute the work among team members. Instead, each of us completed the entire homework independently and then compared our results. We discussed our findings and approaches, ultimately creating a final version that we all agreed upon. I wrote the majority of the results section. 

  • For Part 2, I focused on optimizing the execution time of a specific SQL query that retrieved artist details whose albums received at least 50K listens. I applied indexing on key columns (such as artist.id), reducing execution times from 0.068s (mean) / 0.013s (best) to 0.053s (mean) / 0.011s (best). By using a stepwise indexing approach, I improved performance without over-indexing and ensured that the query ran efficiently.

bottom of page