S3Q1 · Book Data Analysis
⚡ Quick Reference
Five functions on book records (isbn, pages, language, genre):
def get_short_books(book_data):
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data):
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_pages_by_isbn(book_data, isbn):
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

def count_by_language(book_data):
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts

def total_pages_in_genre_lang(book_data, genre, lang):
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)
Key rules:
- Short: pages < 200 | Medium: 200 <= pages <= 500 | (implied Long: > 500)
- get_short_books and get_medium_books return sets of ISBNs
- get_pages_by_isbn returns None if ISBN not found
- count_by_language returns a dict {language: count}
Problem Statement
Implement five analytical functions on a list of book tuples (isbn, pages, language, genre).
Sample data:
books_data = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-14-103620-2", 450, "Tamil", "Fantasy"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
    ("978-0-393-04002-2", 350, "Hindi", "History"),
    ("978-0-06-112008-4", 300, "English", "Fiction"),
    ("978-1-60413-970-0", 175, "Bengali", "Mystery"),
    ("978-0-7432-7356-5", 420, "English", "Science Fiction"),
    ("978-1-56619-909-4", 100, "Tamil", "Romance"),
    ("978-1-4088-4994-7", 270, "Telugu", "Biography"),
    ("978-0-374-53243-2", 540, "English", "Thriller"),
]
Function 1 - get_short_books
Return ISBNs of books with fewer than 200 pages as a set:
Pages < 200: 150 (✅), 175 (✅), 100 (✅) → {"978-3-16-148410-0", "978-1-60413-970-0", "978-1-56619-909-4"} ✓
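For completeness, here is the same set comprehension used in the Quick Reference, run on a small sample. The name short_sample is illustrative only: it is a subset of books_data chosen to include the 200-page boundary case, which must not be reported as short.

```python
def get_short_books(book_data: list) -> set:
    # Keep the ISBN (index 0) of every record whose page count (index 1) is below 200.
    return {b[0] for b in book_data if b[1] < 200}

# Illustrative subset of books_data: three short books plus one boundary case.
short_sample = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),  # exactly 200 pages: not short
    ("978-1-60413-970-0", 175, "Bengali", "Mystery"),
    ("978-1-56619-909-4", 100, "Tamil", "Romance"),
]

short_isbns = get_short_books(short_sample)
print(sorted(short_isbns))
```

The strict comparison is what keeps the 200-page book out of this set and leaves it for get_medium_books.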
Function 2 - get_medium_books
Return ISBNs of books with 200–500 pages inclusive as a set:
def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}
200 ≤ pages ≤ 500: 450, 200, 350, 300, 420, 270 → 6 books ✓
540 is excluded
The 540-page Thriller ("978-0-374-53243-2") is neither short nor medium, so it appears in neither set; it would be "long" (> 500).
Function 3 - get_pages_by_isbn
Find the book by ISBN, return its page count or None:
from typing import Optional

def get_pages_by_isbn(book_data: list, isbn: str) -> Optional[int]:
    for b in book_data:
        if b[0] == isbn:
            return b[1]  # first matching record wins
    return None  # ISBN not found
get_pages_by_isbn(books_data, "978-0-7432-7356-5") → 420 ✓
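The not-found path deserves a check as well. The sketch below uses a two-record subset (lookup_sample, an illustrative name) and a made-up ISBN that is not in the data, and shows one way a caller might handle the None result.

```python
from typing import Optional

def get_pages_by_isbn(book_data: list, isbn: str) -> Optional[int]:
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

# Illustrative subset of books_data.
lookup_sample = [
    ("978-0-7432-7356-5", 420, "English", "Science Fiction"),
    ("978-1-56619-909-4", 100, "Tamil", "Romance"),
]

found = get_pages_by_isbn(lookup_sample, "978-0-7432-7356-5")
missing = get_pages_by_isbn(lookup_sample, "000-0-00-000000-0")  # made-up ISBN

# Test with "is None" rather than truthiness, so that a page count of 0
# would not be mistaken for "not found".
if missing is None:
    print("ISBN not in data")
```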
Function 4 - count_by_language
Count books per language:
def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts
| Language | Count |
|---|---|
| English | 5 |
| Tamil | 2 |
| Hindi | 1 |
| Bengali | 1 |
| Telugu | 1 |
→ {"English": 5, "Tamil": 2, "Hindi": 1, "Bengali": 1, "Telugu": 1} ✓
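A quick way to check the table is to confirm that the counts add up to the number of records. The sketch below rebuilds only the language column of books_data, in its original order. The ISBN, page and genre fields are placeholders, since count_by_language reads index 2 only.

```python
def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts

# Language column of books_data, in the order the records appear.
language_column = [
    "English", "Tamil", "English", "Hindi", "English",
    "Bengali", "English", "Tamil", "Telugu", "English",
]
# Placeholder records: only index 2 (language) carries real data.
records = [(f"isbn-{i}", 0, lang, "n/a") for i, lang in enumerate(language_column)]

counts = count_by_language(records)
print(counts)
print(sum(counts.values()) == len(records))
```

Since dicts preserve insertion order (Python 3.7+), the keys come out in order of first appearance, which is the row order of the table.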
Function 5 - total_pages_in_genre_lang
Sum pages for books matching both genre and language:
def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)
Fiction + English: "978-1-4028-9467-2" (200) + "978-0-06-112008-4" (300) = 500 ✓
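Two further cases are worth checking: another matching pair, and a pair with no matches at all. The sketch below uses an illustrative three-record subset named genre_sample. When nothing matches, sum() over the empty generator returns 0, so no special handling is needed.

```python
def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

# Illustrative subset of books_data.
genre_sample = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-14-103620-2", 450, "Tamil", "Fantasy"),
    ("978-0-374-53243-2", 540, "English", "Thriller"),
]

thriller_english = total_pages_in_genre_lang(genre_sample, "Thriller", "English")  # 150 + 540
fantasy_english = total_pages_in_genre_lang(genre_sample, "Fantasy", "English")    # no such book
print(thriller_english, fantasy_english)
```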
Complete solution approaches
Comprehension-based version:

def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_pages_by_isbn(book_data: list, isbn: str):
    return next((b[1] for b in book_data if b[0] == isbn), None)

def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

Explicit-loop version:

def get_short_books(book_data: list) -> set:
    result = set()
    for b in book_data:
        if b[1] < 200:
            result.add(b[0])
    return result

def get_medium_books(book_data: list) -> set:
    result = set()
    for b in book_data:
        if 200 <= b[1] <= 500:
            result.add(b[0])
    return result

def get_pages_by_isbn(book_data: list, isbn: str):
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        lang = b[2]
        if lang in counts:
            counts[lang] += 1
        else:
            counts[lang] = 1
    return counts

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    total = 0
    for b in book_data:
        if b[3] == genre and b[2] == lang:
            total += b[1]
    return total

Version using collections.Counter:

from collections import Counter

def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_pages_by_isbn(book_data: list, isbn: str):
    return next((b[1] for b in book_data if b[0] == isbn), None)

def count_by_language(book_data: list) -> dict:
    return dict(Counter(b[2] for b in book_data))

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)
Key takeaways
Return sets for get_short/medium_books
The return type is set, so use a set comprehension {b[0] for b in ...}. Sets deduplicate automatically, and membership tests on a set take O(1) time on average, versus O(n) for a list.
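To see the deduplication in action, the sketch below feeds the same filter a sample containing a repeated record. The duplicate row is hypothetical (books_data itself has no repeats), and dup_sample is an illustrative name.

```python
# Illustrative data: the second row is a hypothetical duplicate of the first.
dup_sample = [
    ("978-1-56619-909-4", 100, "Tamil", "Romance"),
    ("978-1-56619-909-4", 100, "Tamil", "Romance"),
    ("978-3-16-148410-0", 150, "English", "Thriller"),
]

as_list = [b[0] for b in dup_sample if b[1] < 200]  # list comprehension keeps the repeat
as_set = {b[0] for b in dup_sample if b[1] < 200}   # set comprehension collapses it

print(len(as_list), len(as_set))
```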
next(..., None) for safe ISBN lookup
next((b[1] for b in book_data if b[0] == isbn), None) returns the page count for the first matching book, or None if no match is found - no explicit loop or try/except needed.
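The default argument is what makes this safe. As a small illustration, the sketch below calls next() with and without the default on a lookup that has no match. The ISBN is made up, and lookup_sample is an illustrative one-record list.

```python
lookup_sample = [("978-0-7432-7356-5", 420, "English", "Science Fiction")]
unknown = "000-0-00-000000-0"  # made-up ISBN, not in the data

# With a default: an exhausted generator yields the default instead of raising.
pages = next((b[1] for b in lookup_sample if b[0] == unknown), None)

# Without a default: an exhausted generator raises StopIteration.
try:
    next(b[1] for b in lookup_sample if b[0] == unknown)
    raised = False
except StopIteration:
    raised = True

print(pages, raised)
```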
Two-condition filter for total_pages_in_genre_lang
Both conditions, b[3] == genre and b[2] == lang, must hold. Python's and operator short-circuits, so the language comparison is skipped whenever the genre does not match, which saves a little work on data with many genres.
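The short-circuit can be observed by replacing the language comparison with a helper that records each call. Here language_matches and language_checks are hypothetical names introduced only for this demonstration, and sc_sample is a four-record subset of books_data.

```python
# Illustrative subset of books_data: two Fiction books, two of other genres.
sc_sample = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
    ("978-0-393-04002-2", 350, "Hindi", "History"),
    ("978-0-06-112008-4", 300, "English", "Fiction"),
]

language_checks = []  # ISBNs for which the language test actually ran

def language_matches(book, lang):
    # Hypothetical helper: same test as b[2] == lang, but it logs every call.
    language_checks.append(book[0])
    return book[2] == lang

total = sum(
    b[1] for b in sc_sample
    if b[3] == "Fiction" and language_matches(b, "English")
)

print(total, len(language_checks))
```

Only the two Fiction records reach the language test; for the other two, the genre comparison is already False and the right-hand operand is never evaluated.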