2023-09-22 10:02:55 -04:00
commit 44e64fcbdb
30 changed files with 1082 additions and 0 deletions


@ -0,0 +1,58 @@
## Optimizing Queries and Data Comparisons
This challenge will take what we already know about SQL and Python and add the necessary tools to optimize how we make queries and compare data.
#### Setup
In create_db.py, initialize and create a new database "schedules.db" and give it the necessary tables to hold the following:
Students have a name and a major.
Classes have a title and a field of study.
Major and field of study are the same thing, i.e., Economics is both a major and a field of study.
Students have many classes.
Classes have many students.
Write a seed_db() function that takes the existing class and major data imported into create_db.py and randomly fills the database with it.
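Since students have many classes and classes have many students, one common way to model this is a third "join" table with one row per enrollment. The sketch below is only one possible layout: the table names (`students`, `classes`, `enrollments`), the column names, and the `create_schema` helper are assumptions, not part of the challenge. It uses an in-memory database so it can be run without touching schedules.db.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS students (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    major TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS classes (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    field TEXT NOT NULL
);
-- Join table: one row per (student, class) enrollment.
CREATE TABLE IF NOT EXISTS enrollments (
    student_id INTEGER NOT NULL REFERENCES students(id),
    class_id   INTEGER NOT NULL REFERENCES classes(id)
);
"""


def create_schema(path=":memory:"):
    """Open (or create) the database at `path` and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_schema()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Passing `"schedules.db"` instead of `":memory:"` would write the same tables to disk.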
#### Query
Write a function shared_classes() in Python that takes two student names. It should return the classes they take together, if any. How you do this is open-ended.
What is the Big O time complexity of your algorithm for loading the students and comparing their classes? Write it down, and benchmark your code.
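As a starting point for the benchmark, here is a minimal sketch of the simplest list comparison, timed with the standard library's `timeit`. The function name `shared_naive` and the sample class lists are made up for illustration; in your solution the lists would come from the database.

```python
import timeit


def shared_naive(classes_a, classes_b):
    """Nested-loop comparison: O(len(classes_a) * len(classes_b))."""
    shared = []
    for title in classes_a:
        for other in classes_b:
            if title == other:
                shared.append(title)
                break  # stop scanning once this title is matched
    return shared


# Illustrative data only; real data would be loaded from the database.
a = ['Nursing', 'Cyber Crime', 'Illustration']
b = ['Illustration', 'Marriage', 'Nursing']

result = shared_naive(a, b)
elapsed = timeit.timeit(lambda: shared_naive(a, b), number=10_000)
print(result)
print(f"{elapsed:.4f}s for 10,000 calls")
```

Timing many repetitions rather than a single call smooths out noise, which matters when each call takes only microseconds.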
#### First Optimization - the Index
Read about [how indexes work](http://www.programmerinterview.com/index.php/database-sql/what-is-an-index/). What type of data structure does sqlite3 use to hold indexes?
Now, add an index to one or more of your tables on columns you think would be appropriate.
Benchmark again - what is your speed now?
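Besides timing, you can ask SQLite whether a query actually uses your index with `EXPLAIN QUERY PLAN`. The sketch below assumes a `students` table with a `name` column and an index called `idx_students_name`; those names, the sample rows, and the `plan` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and sample rows, for illustration only.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO students (name, major) VALUES (?, ?)",
    [("Ada", "Mathematics"), ("Grace", "Music")])


def plan(sql, params):
    """Return SQLite's query plan description for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


query = "SELECT id FROM students WHERE name = ?"
before = plan(query, ("Ada",))
conn.execute("CREATE INDEX idx_students_name ON students (name)")
after = plan(query, ("Ada",))

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but after the index is created the plan should mention it by name instead of describing a full table scan.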
#### Second Optimization - use a Set
Are you comparing the data returned from the DB in Python? If so, you probably aren't using the right data type. We haven't covered sets yet, so read about them [here](https://docs.python.org/3.4/library/stdtypes.html#set).
A set is very similar to a dictionary, except that it stores only keys, not values. The syntax looks like this:
```
{'Programming', 'Calculus', 'Literature'}
```
Use sets to store the returned class data for both students, and use the set type's built-in methods to find the intersection, if one exists.
The average-case Big O complexity of this operation is `O(min(len(x), len(y)))`; the worst case is `O(len(x) * len(y))`. If you did the simplest comparison with lists (one loop nested inside another), it was probably `O(len(x) * len(y))` every time.
Read about sets. Why is this? Here's a [list](https://wiki.python.org/moin/TimeComplexity) of the time complexities of the methods on Python's built-in data types.
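A minimal sketch of the set-based comparison follows. The function name `shared_with_sets` and the sample lists are illustrative; the `&` operator is the built-in set intersection and is equivalent to calling `.intersection()`.

```python
def shared_with_sets(classes_a, classes_b):
    """Convert both lists to sets, then intersect them.

    Building the sets costs O(len(classes_a) + len(classes_b)); the
    intersection itself is O(min(len(a), len(b))) on average because
    each membership test on a set is a hash lookup.
    """
    return set(classes_a) & set(classes_b)


# Illustrative data only.
ada = ['Programming', 'Calculus', 'Literature']
grace = ['Calculus', 'Economic Life', 'Programming']

shared = shared_with_sets(ada, grace)
print(sorted(shared))
```

Note that a set is unordered, so sort the result if you need a stable order for display or comparison.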
#### Third Optimization - a better (longer!) Query
Instead of doing all your comparisons on the Python side, can we pull the information straight out of the database with a more advanced query?
Before you write the query to replace your original answers, let's write a new function that takes a major and a number of students, and returns the classes in which at least that many students have that major.
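One way to approach this is to join the tables, filter by major, group by class, and keep only groups that are large enough. The sketch below is an assumption-heavy illustration: the schema, the sample rows, and the function name `classes_with_major` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT);
CREATE TABLE classes (id INTEGER PRIMARY KEY, title TEXT, field TEXT);
CREATE TABLE enrollments (student_id INTEGER, class_id INTEGER);
INSERT INTO students VALUES
    (1, 'Ada', 'Music'), (2, 'Grace', 'Music'), (3, 'Linus', 'Art');
INSERT INTO classes VALUES
    (1, 'Music Listening', 'Music'), (2, 'Illustration', 'Art');
INSERT INTO enrollments VALUES (1, 1), (2, 1), (3, 1), (1, 2), (3, 2);
""")


def classes_with_major(conn, major, minimum):
    """Titles of classes with at least `minimum` students of `major`."""
    rows = conn.execute("""
        SELECT c.title
        FROM classes AS c
        JOIN enrollments AS e ON e.class_id = c.id
        JOIN students AS s ON s.id = e.student_id
        WHERE s.major = ?
        GROUP BY c.id
        HAVING COUNT(*) >= ?
        ORDER BY c.title
    """, (major, minimum)).fetchall()
    return [title for (title,) in rows]


popular = classes_with_major(conn, 'Music', 2)
print(popular)
```

`WHERE` filters individual rows before grouping, while `HAVING` filters whole groups afterwards, which is why the count threshold goes in `HAVING`.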
Now, write your enhanced query to find the intersection without any Python data parsing.
Benchmark this function against your first two.
Sandbox!! You will want to get familiar with the following SQL commands: GROUP BY, HAVING, IN.
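To show how GROUP BY, HAVING, and IN can combine, here is one possible sketch of a query-only shared_classes(). It assumes a hypothetical schema and sample data, and it assumes that student names are unique and that the two names passed in are different; with duplicate names the count would be misleading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT);
CREATE TABLE classes (id INTEGER PRIMARY KEY, title TEXT, field TEXT);
CREATE TABLE enrollments (student_id INTEGER, class_id INTEGER);
INSERT INTO students VALUES
    (1, 'Ada', 'Music'), (2, 'Grace', 'Music'), (3, 'Linus', 'Art');
INSERT INTO classes VALUES
    (1, 'Music Listening', 'Music'), (2, 'Illustration', 'Art');
INSERT INTO enrollments VALUES (1, 1), (2, 1), (3, 1), (1, 2), (3, 2);
""")


def shared_classes(conn, name_a, name_b):
    """Titles of classes that both named students are enrolled in.

    Assumes names are unique and name_a != name_b.
    """
    rows = conn.execute("""
        SELECT c.title
        FROM classes AS c
        JOIN enrollments AS e ON e.class_id = c.id
        JOIN students AS s ON s.id = e.student_id
        WHERE s.name IN (?, ?)
        GROUP BY c.id
        HAVING COUNT(DISTINCT s.id) = 2
        ORDER BY c.title
    """, (name_a, name_b)).fetchall()
    return [title for (title,) in rows]


both = shared_classes(conn, 'Ada', 'Linus')
print(both)
```

The `IN` clause narrows the rows to the two students, and `HAVING COUNT(DISTINCT s.id) = 2` keeps only the classes where both of them appear, so the database returns the intersection directly.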


@ -0,0 +1,124 @@
def classes():
    return ['Pharmacology for AT',
'Human Nutrition',
'Comp Networks & Security',
'Principles of Counseling',
'Intro to Human Com',
'Mass Comm',
'Public Relations',
'Com Relations in Orgs',
'Survey of Com Research',
'Technology and Society',
'Intro Comps & Office',
'Intro to Microsoft Excel',
'Intro to Microsoft Acces',
'Computer Science II',
'Computer Science II Lab',
'Logic for Comp Scientist',
'Comp NW Culture: Music',
'Comp NW Social Systems',
'Economic Life',
'Principle Microeconomics',
'The Global Economy',
'Exceptionalities',
'Continuation Prep Math',
'Internship I',
'Engineering Fundamentals',
'Acad. Writing & Reading',
'Research Writing',
'Business Writing',
'British Texts',
'Post-Colonial Texts',
'Technical Writing',
'Intro to Poetry Writing',
'Intro to Food Science',
'Farm Business Management',
'Food Plant San and HACCP',
'Food Microbiology',
'Food Chem and Analysis',
'Food Laws and Regulation',
'West. Civ. to 1500',
'West & World since 1500',
'Tech Based Ventures',
'I&E Seminar Series',
'Illustration',
'E-Commerce Advertising',
'Psychology of Sport',
'Human Resources Mgt',
'Internship in IS',
'Principles of Marketing',
'Music Listening',
'Music in Western Culture',
'Patho Across Lifespan',
'Found of Research & EBP',
'Transition Role Prof Nur',
'Nur Role for Unlicensed',
'Nursing',
'Ldrship & Mngmt',
'Colab Impv Pat Hlth Outc',
'Honors Project Seminar',
'Professional Devel II',
'Professional Devel IV',
'Integ Office Software',
'Admin Office Management',
'Self as Leader',
'Leading Change',
'Political Life',
'International Politics',
'Cyber Crime',
'Topics Criminal Justice',
'Intro to Psychology Lab',
'Psych of Disabilities',
'Childhood & Adolescence',
'Psychology Men and Women',
'Forensic Psychology',
'Honors Pharmacology Res',
'Drug & Alcohol Abuse',
'Rehab Internship',
'Educational Interpreting',
'SLI Senior Capstone',
'Intro to Sociology',
'Grow/Change Urb Society',
'Womens Studies',
'Ethics in Engineering',
'Adv Ergon',
'Lean Proc Impr Engr',
'Comp Networks & Security Lab',
'Host Computer Security',
'Information Security',
'Adv. Comp. Networks',
'CNL Theory and Practice',
'Techniques of Counseling',
'Stats Res for Counseling',
'Group Background &Theory',
'Asses & Eval in Counsel',
'Marriage',
'Coun Life-Span Develop',
'Pro Orient Eth & Leg Iss',
'Multicultural Counseling',
'Prin & Prac of Schl Coun',
'Human Sexuality Counsel',
'Clin Assess in Cnl Prac',
'Diagnosis Clin Cnl Prac',
'Comp Sys & Structures',
'Functional & Logic Prog.',
'Information Retrieval',
'Eco Applica Internet II',
'Read & Lit II: Int Spec',
'Teach in the Amer Ed Sys',
'Action Research: Science',
'Diagnosis and Assessment',
'Pract I: Intervention',
'Literacy Inquiry Project',
'Ldrshp Schl Improvement',
'Analysis of Teaching',
'Tchr Ldr Masters Exit',
'Data Driven Decisions',
'Advanced Ed Measurement',
'Bldg-Level Leadership',
'Bldg Budget',
'Principal Practicum',
'Adv Tchr Ldr Seminar',
'Pol & Soc Contexts',
'Organizational Behavior',
'Superintendent Practicum']


@ -0,0 +1,11 @@
import sqlite3
import classes
import majors
import random
from faker import Faker
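def create():
    """Create schedules.db and its tables (left for you to implement)."""
    pass

def seed():
    """Randomly fill the database from classes and majors (left for you to implement)."""
    pass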


@ -0,0 +1,41 @@
def majors():
    return ['Biology',
'Chemistry',
'Mathematics',
'Political Science',
'Psychology',
'Social Sciences',
'Social Work',
'Sociology',
'Speech Pathology',
'Accounting',
'Finance',
'Management',
'Management Information Systems',
'Marketing',
'Family & Consumer Sciences',
'Child & Family Studies',
'Dietetics',
'Retail Merchandising',
'Art',
'Ceramics',
'Drawing',
'Graphic Design',
'Painting',
'Photography',
'Printmaking',
'Sculpture',
'New Media',
'Mass Communication',
'Music',
'Composition',
'Instrumental Performance',
'Organ Performance',
'Piano Pedagogy',
'Piano Performance',
'Vocal Performance',
'Theatre',
'Acting',
'Directing',
'Design and Technology',
'Musical Theater']