S3Q1 · Book Data Analysis

⚡ Quick Reference

Five functions on book records (isbn, pages, language, genre):

def get_short_books(book_data):
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data):
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_pages_by_isbn(book_data, isbn):
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

def count_by_language(book_data):
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts

def total_pages_in_genre_lang(book_data, genre, lang):
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

Key rules:
- Short: pages < 200 | Medium: 200 <= pages <= 500 | (implied Long: pages > 500)
- get_short_books and get_medium_books return sets of ISBNs
- get_pages_by_isbn returns None if the ISBN is not found
- count_by_language returns a dict {language: count}
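The thresholds in the key rules can be checked at their boundaries with a small classifier. This is only a sketch: `classify_length` is a hypothetical helper that is not part of the exercise, and the "long" label is implied by the rules rather than stated.

```python
def classify_length(pages: int) -> str:
    """Label a page count using the thresholds from the key rules."""
    if pages < 200:
        return "short"
    if pages <= 500:      # 200 <= pages <= 500, both ends inclusive
        return "medium"
    return "long"         # implied category: more than 500 pages

# Boundary values are the ones most likely to be misclassified.
for pages in (199, 200, 500, 501):
    print(pages, classify_length(pages))
```

Both 200 and 500 land in "medium", which is why the medium filter uses `<=` on both sides.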

Problem Statement

Problem

Implement five analytical functions on a list of book tuples (isbn, pages, language, genre).

Sample data:

books_data = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-14-103620-2", 450, "Tamil",   "Fantasy"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
    ("978-0-393-04002-2", 350, "Hindi",   "History"),
    ("978-0-06-112008-4", 300, "English", "Fiction"),
    ("978-1-60413-970-0", 175, "Bengali", "Mystery"),
    ("978-0-7432-7356-5", 420, "English", "Science Fiction"),
    ("978-1-56619-909-4", 100, "Tamil",   "Romance"),
    ("978-1-4088-4994-7", 270, "Telugu",  "Biography"),
    ("978-0-374-53243-2", 540, "English", "Thriller"),
]

Function 1 - `get_short_books`

Return ISBNs of books with fewer than 200 pages as a set:

def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

Pages < 200: 150 (✅), 175 (✅), 100 (✅) → {"978-3-16-148410-0", "978-1-60413-970-0", "978-1-56619-909-4"} ✓

Function 2 - `get_medium_books`

Return ISBNs of books with 200–500 pages inclusive as a set:

def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

200 ≤ pages ≤ 500: 450, 200, 350, 300, 420, 270 → 6 books ✓

540 is excluded

The 540-page Thriller ("978-0-374-53243-2") appears in neither set: it is not short (< 200) and not medium (200 to 500), so it would count as "long" (> 500).
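One way to confirm that the 540-page book is the only one left out is to add a third filter and check that the three sets cover every ISBN exactly once. This is a sketch: `get_long_books` is a hypothetical function that the exercise does not ask for.

```python
books_data = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-14-103620-2", 450, "Tamil",   "Fantasy"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
    ("978-0-393-04002-2", 350, "Hindi",   "History"),
    ("978-0-06-112008-4", 300, "English", "Fiction"),
    ("978-1-60413-970-0", 175, "Bengali", "Mystery"),
    ("978-0-7432-7356-5", 420, "English", "Science Fiction"),
    ("978-1-56619-909-4", 100, "Tamil",   "Romance"),
    ("978-1-4088-4994-7", 270, "Telugu",  "Biography"),
    ("978-0-374-53243-2", 540, "English", "Thriller"),
]

def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_long_books(book_data: list) -> set:
    # Hypothetical third category, implied by the rules (> 500 pages).
    return {b[0] for b in book_data if b[1] > 500}

short_isbns = get_short_books(books_data)
medium_isbns = get_medium_books(books_data)
long_isbns = get_long_books(books_data)

# The three sets should not overlap and should cover every ISBN.
all_isbns = {b[0] for b in books_data}
print(short_isbns | medium_isbns | long_isbns == all_isbns)   # True
print(short_isbns & medium_isbns)                             # set()
print(len(short_isbns), len(medium_isbns), len(long_isbns))   # 3 6 1
```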

Function 3 - `get_pages_by_isbn`

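Find the book by ISBN and return its page count, or None if no book matches:

from typing import Optional  # the function may return None, so the result is Optional[int]

def get_pages_by_isbn(book_data: list, isbn: str) -> Optional[int]: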
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

get_pages_by_isbn(books_data, "978-0-7432-7356-5") → 420 ✓
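The example above covers only the found case. The sketch below adds the not-found case and, as an optional variation, a dictionary index for repeated lookups. `build_page_index` and the ISBN "000-0-00-000000-0" are made up for illustration.

```python
from typing import Optional

books_data = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-7432-7356-5", 420, "English", "Science Fiction"),
    ("978-0-374-53243-2", 540, "English", "Thriller"),
]

def get_pages_by_isbn(book_data: list, isbn: str) -> Optional[int]:
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

def build_page_index(book_data: list) -> dict:
    """Hypothetical helper: map each ISBN to its page count."""
    index = {}
    for isbn, pages, *_ in book_data:
        index.setdefault(isbn, pages)  # keep the first match, like the linear scan
    return index

print(get_pages_by_isbn(books_data, "978-0-7432-7356-5"))  # 420
print(get_pages_by_isbn(books_data, "000-0-00-000000-0"))  # None

# One pass to build the index, then each lookup is a dict access.
index = build_page_index(books_data)
print(index.get("978-0-7432-7356-5"))  # 420
print(index.get("000-0-00-000000-0"))  # None
```

The linear scan is fine for a single lookup; the index is worth building only when many lookups hit the same list.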

Function 4 - `count_by_language`

Count books per language:

def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts

Language	Count
English	5
Tamil	2
Hindi	1
Bengali	1
Telugu	1

→ {"English": 5, "Tamil": 2, "Hindi": 1, "Bengali": 1, "Telugu": 1} ✓
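To see how `counts.get(b[2], 0) + 1` builds the dictionary, it can help to snapshot the counts after each row. A minimal sketch, using only the first three sample rows; `count_by_language_trace` is a hypothetical variant written for this illustration.

```python
def count_by_language_trace(book_data: list):
    """Same logic as count_by_language, but records a snapshot after each row."""
    counts = {}
    snapshots = []
    for b in book_data:
        # get() returns 0 the first time a language is seen, so no KeyError.
        counts[b[2]] = counts.get(b[2], 0) + 1
        snapshots.append(dict(counts))  # copy, so later updates do not alter it
    return counts, snapshots

rows = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-14-103620-2", 450, "Tamil",   "Fantasy"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
]

counts, snapshots = count_by_language_trace(rows)
for snap in snapshots:
    print(snap)
# {'English': 1}
# {'English': 1, 'Tamil': 1}
# {'English': 2, 'Tamil': 1}
```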

Function 5 - `total_pages_in_genre_lang`

Sum pages for books matching both genre and language:

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

Fiction + English: "978-1-4028-9467-2" (200) + "978-0-06-112008-4" (300) = 500 ✓
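Two further cases are worth checking: a genre with more than one match in a different language mix, and a combination that matches nothing. The sketch below uses five of the sample rows; because `sum` of an empty sequence is 0, the no-match case needs no special handling.

```python
def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

# Subset of the sample data.
rows = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-0-14-103620-2", 450, "Tamil",   "Fantasy"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
    ("978-0-06-112008-4", 300, "English", "Fiction"),
    ("978-0-374-53243-2", 540, "English", "Thriller"),
]

print(total_pages_in_genre_lang(rows, "Fiction", "English"))   # 500
print(total_pages_in_genre_lang(rows, "Thriller", "English"))  # 690
print(total_pages_in_genre_lang(rows, "Fiction", "Tamil"))     # 0 (no match)
```

Note the argument order: genre first, then language, even though language comes first in the tuple.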

Complete solution approaches

Three complete versions follow, in this order: Pythonic (set comprehensions and generator expressions), Explanatory (loops), and Using Counter.

def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_pages_by_isbn(book_data: list, isbn: str):
    return next((b[1] for b in book_data if b[0] == isbn), None)

def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        counts[b[2]] = counts.get(b[2], 0) + 1
    return counts

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

def get_short_books(book_data: list) -> set:
    result = set()
    for b in book_data:
        if b[1] < 200:
            result.add(b[0])
    return result

def get_medium_books(book_data: list) -> set:
    result = set()
    for b in book_data:
        if 200 <= b[1] <= 500:
            result.add(b[0])
    return result

def get_pages_by_isbn(book_data: list, isbn: str):
    for b in book_data:
        if b[0] == isbn:
            return b[1]
    return None

def count_by_language(book_data: list) -> dict:
    counts = {}
    for b in book_data:
        lang = b[2]
        if lang in counts:
            counts[lang] += 1
        else:
            counts[lang] = 1
    return counts

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    total = 0
    for b in book_data:
        if b[3] == genre and b[2] == lang:
            total += b[1]
    return total

from collections import Counter

def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

def get_medium_books(book_data: list) -> set:
    return {b[0] for b in book_data if 200 <= b[1] <= 500}

def get_pages_by_isbn(book_data: list, isbn: str):
    return next((b[1] for b in book_data if b[0] == isbn), None)

def count_by_language(book_data: list) -> dict:
    return dict(Counter(b[2] for b in book_data))

def total_pages_in_genre_lang(book_data: list, genre: str, lang: str) -> int:
    return sum(b[1] for b in book_data if b[3] == genre and b[2] == lang)

Key takeaways

01

Return sets for get_short/medium_books

The required return type is set, so use a set comprehension {b[0] for b in ...}. A set drops duplicate ISBNs automatically, and a membership test (isbn in result) takes constant time on average, whereas a list needs a linear scan.
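The deduplication point can be seen with a small sketch. The data here is hypothetical: the same short book is deliberately listed twice, which does not happen in the sample data.

```python
def get_short_books(book_data: list) -> set:
    return {b[0] for b in book_data if b[1] < 200}

# Hypothetical data: the first row is repeated.
rows = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-1-56619-909-4", 100, "Tamil",   "Romance"),
]

as_list = [b[0] for b in rows if b[1] < 200]  # list keeps the duplicate
as_set = get_short_books(rows)                # set collapses it

print(len(as_list))                    # 3
print(len(as_set))                     # 2
print("978-1-56619-909-4" in as_set)   # True
```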

02

next(..., None) for safe ISBN lookup

next((b[1] for b in book_data if b[0] == isbn), None) returns the page count for the first matching book, or None if no match is found - no explicit loop or try/except needed.
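The role of the default argument is easiest to see by comparing both forms on a lookup that fails. A minimal sketch; the ISBN "000-0-00-000000-0" is made up so that nothing matches.

```python
rows = [("978-0-7432-7356-5", 420, "English", "Science Fiction")]
missing = "000-0-00-000000-0"  # made-up ISBN with no matching row

# With a default: the exhausted generator yields the default instead of raising.
with_default = next((b[1] for b in rows if b[0] == missing), None)
print(with_default)  # None

# Without a default: next() raises StopIteration on an exhausted generator.
try:
    next(b[1] for b in rows if b[0] == missing)
    raised = False
except StopIteration:
    raised = True
print(raised)  # True
```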

03

Two-condition filter for total_pages_in_genre_lang

Both conditions, b[3] == genre and b[2] == lang, must hold. Because the and operator short-circuits, the language check is skipped whenever the genre does not match. The saving per row is small, but it means the second comparison runs only on rows that pass the first.
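Short-circuiting can be made visible by counting how often each check runs. This is an illustrative sketch only: `genre_matches`, `lang_matches`, and the `calls` counter are hypothetical instrumentation, not part of the solution.

```python
calls = {"genre": 0, "lang": 0}

def genre_matches(b: tuple, genre: str) -> bool:
    calls["genre"] += 1
    return b[3] == genre

def lang_matches(b: tuple, lang: str) -> bool:
    calls["lang"] += 1
    return b[2] == lang

rows = [
    ("978-3-16-148410-0", 150, "English", "Thriller"),
    ("978-1-4028-9467-2", 200, "English", "Fiction"),
    ("978-0-14-103620-2", 450, "Tamil",   "Fantasy"),
    ("978-0-06-112008-4", 300, "English", "Fiction"),
]

# lang_matches runs only for rows where genre_matches returned True.
total = sum(
    b[1] for b in rows
    if genre_matches(b, "Fiction") and lang_matches(b, "English")
)

print(total)  # 500
print(calls)  # {'genre': 4, 'lang': 2}
```

The genre check runs on all four rows, while the language check runs only on the two Fiction rows.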