• Nir Zicherman – How AI Image Models Work

    https://every.to/p/how-ai-image-models-work

     

    Zicherman draws an analogy to a children’s game in which noise in sentences must be corrected to reveal a coherent plot, and uses it to explain how AI models generate clear images by iteratively removing noise. The model is trained to recognize patterns in noisy data, and a text prompt steers the denoising toward the desired image. The analogy demystifies the complex mathematics and computing behind modern AI image generation.

     

     

  • sRGB vs REC709 – An introduction and FFmpeg implementations

    1. Basic Comparison

    • What they are
      • sRGB: A standard “web”/computer-display RGB color space defined by IEC 61966-2-1. It’s used for most monitors, cameras, printers, and the vast majority of images on the Internet.
      • Rec. 709: An HD-video color space defined by ITU-R BT.709. It’s the go-to standard for HDTV broadcasts, Blu-ray discs, and professional video pipelines.
    • Why they exist
      • sRGB: Ensures consistent colors across different consumer devices (PCs, phones, webcams).
      • Rec. 709: Ensures consistent colors across video production and playback chains (cameras → editing → broadcast → TV).
    • What you’ll see
      • On your desktop or phone, images tagged sRGB will look “right” without extra tweaking.
      • On an HDTV or video-editing timeline, footage tagged Rec. 709 will display accurate contrast and hue on broadcast-grade monitors.

    2. Digging Deeper

    | Feature | sRGB | Rec. 709 |
    | --- | --- | --- |
    | White point | D65 (6504 K) | D65 (6504 K) |
    | Primaries (x, y) | R: (0.640, 0.330), G: (0.300, 0.600), B: (0.150, 0.060) | R: (0.640, 0.330), G: (0.300, 0.600), B: (0.150, 0.060) |
    | Gamut size | Triangle on the CIE 1931 chart | Identical to sRGB |
    | Gamma / transfer | Piecewise curve: linear toe, then a 2.4 power segment; approximates γ ≈ 2.2 overall | Piecewise OETF: linear toe, then a 0.45 power segment; displays typically decode with γ ≈ 2.4 (BT.1886), often approximated as 2.2 in practice |
    | Matrix coefficients | N/A (pure RGB usage) | Y′ = 0.2126 R′ + 0.7152 G′ + 0.0722 B′ (Rec. 709 matrix) |
    | Typical bit depth | 8-bit/channel (with 16-bit variants) | 8-bit/channel (10-bit for professional video) |
    | Usage metadata | Tagged as “sRGB” in image files (PNG, JPEG, etc.) | Tagged as “bt709” in video containers (MP4, MOV) |
    | Color range | Full-range RGB (0–255) | Studio-range Y′CbCr (Y′ 16–235, Cb/Cr 16–240) |
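To make the transfer, matrix, and range rows of the comparison concrete, here is a minimal Python sketch. It assumes the constants published in IEC 61966-2-1 and ITU-R BT.709; the function names are hypothetical and chosen only for illustration, and this is not the FFmpeg implementation the post goes on to describe.

```python
def srgb_encode(linear: float) -> float:
    """sRGB piecewise transfer: linear light in [0, 1] -> encoded value."""
    if linear <= 0.0031308:
        return 12.92 * linear                      # linear toe near black
    return 1.055 * linear ** (1 / 2.4) - 0.055     # power segment


def rec709_oetf(linear: float) -> float:
    """BT.709 camera OETF: linear scene light in [0, 1] -> encoded signal."""
    if linear < 0.018:
        return 4.5 * linear                        # linear toe near black
    return 1.099 * linear ** 0.45 - 0.099          # power segment


def rec709_luma(r: float, g: float, b: float) -> float:
    """Luma Y' from gamma-encoded R'G'B' using the Rec. 709 coefficients."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def luma_to_studio_8bit(y: float) -> int:
    """Map Y' in [0, 1] onto the 8-bit studio range 16..235."""
    return round(16 + 219 * y)


def chroma_to_studio_8bit(c: float) -> int:
    """Map Cb or Cr in [-0.5, 0.5] onto the 8-bit studio range 16..240."""
    return round(128 + 224 * c)


if __name__ == "__main__":
    mid_gray = 0.18  # 18% linear reflectance, a common reference level
    print("sRGB   :", srgb_encode(mid_gray))
    print("Rec.709:", rec709_oetf(mid_gray))
    print("Y' code:", luma_to_studio_8bit(rec709_luma(1.0, 1.0, 1.0)))
```

For the same 18% linear gray, the two curves give noticeably different encoded values (roughly 0.46 for sRGB and 0.41 for Rec. 709), even though the primaries and white point are identical. That gap, together with the full-range versus studio-range quantization, is where mismatched tagging usually becomes visible as a contrast shift.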


    Why the Small Differences Matter

    (more…)