picturarium/TODO.md

# Tasks

- understand `picsum` api and what exactly I need from it
- understand what exactly is meant by caching (it seems the assignment only wants to cache explicit calls to `picsum`, but there's also image browser caching to consider)
- What exactly is Material UI? I assume this is a library of styled components (probably with a React lib available too)
- Bonus: Generating API descriptions via AI. This seems much more complex than the rest.

# Picsum API

TASK: understand picsum api
specifically the `list` endpoint

Seems that in the background they have one huge list of images
that's stable (i.e. a request with the same query parameters always returns the same list of images).

They use pages. What's a page?

So is this like

```
type Pages =
  List Page

type Page =
  List ImageRef
```

Is there some sort of limit on a given page? Like at most 30 images per page or something like that?
How does that work?

## Basic Test

The `https://picsum.photos/200` redirects to e.g. `https://fastly.picsum.photos/id/338/200/200.jpg?hmac=5S5SeR5xW8mbN3Ml7wTTJPePX392JafhcFMGm7IFNy0` which is the image (jpeg).
This basically means:

```
{
  id: 338,
  width: 200,
  height: 200,
  format: "jpg",
}
```

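It also includes an `hmac` query parameter. Why? Apparently this is related to CDN caching. Judging by the name it is an HMAC, i.e. a keyed signature of the URL rather than a plain hash of the image, presumably so the CDN only serves URLs that picsum itself issued.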
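To make the URL-to-record mapping above concrete, here is a minimal sketch of a parser for the redirected URL. `parseImageUrl` and `ParsedImageUrl` are hypothetical names, and the URL shape is only inferred from the single example above, so treat this as an assumption rather than a documented contract.

```typescript
// Information encoded in a redirected image URL (inferred from one example).
type ParsedImageUrl = {
  id: number;
  width: number;
  height: number;
  format: string;
  hmac: string | null;
};

// Hypothetical helper: extracts id, size and format from a URL shaped like
// https://fastly.picsum.photos/id/<id>/<width>/<height>.<format>?hmac=<signature>
// Returns null for anything that does not match that shape.
function parseImageUrl(url: string): ParsedImageUrl | null {
  const match = /\/id\/(\d+)\/(\d+)\/(\d+)\.([a-z0-9]+)(?:\?(.*))?$/i.exec(url);
  if (match === null) return null;
  const [, id, width, height, format, query] = match;
  // The query string is optional, so fall back to an empty string.
  const hmacMatch = /(?:^|&)hmac=([^&]*)/.exec(query ?? "");
  return {
    id: Number(id),
    width: Number(width),
    height: Number(height),
    format,
    hmac: hmacMatch ? hmacMatch[1] : null,
  };
}
```

Feeding it the redirect target from the basic test gives back `id: 338`, `width: 200`, `height: 200`, `format: "jpg"`, while the short `https://picsum.photos/200` form yields `null`.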

## Pages

Ok, set `limit = 10` - that's the page size. Experimentally verified that pages start at `1` not `0`.

```
https://picsum.photos/v2/list?limit=10&page=1
```
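A small sketch of how the list URL could be built from `page` and `limit`. `listUrl` and `PICSUM_LIST_ENDPOINT` are hypothetical names; the 1-based page check encodes the experimental finding above, not something taken from the picsum docs.

```typescript
const PICSUM_LIST_ENDPOINT = "https://picsum.photos/v2/list";

// Builds the URL for one page of the image list.
// Pages are assumed to be 1-based, `limit` is the page size.
function listUrl(page: number, limit: number): string {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError(`page must be an integer >= 1, got ${page}`);
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError(`limit must be an integer >= 1, got ${limit}`);
  }
  return `${PICSUM_LIST_ENDPOINT}?limit=${limit}&page=${page}`;
}
```

Having one function own the URL also gives a natural cache key later on: same `page` and `limit`, same string.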

This is what's returned in the 3rd page (with `limit=10`).

```
[
  {
    "id": "30",
    "author": "Shyamanta Baruah",
    "width": 1280,
    "height": 901,
    "url": "https://unsplash.com/photos/aeVA-j1y2BY",
    "download_url": "https://picsum.photos/id/30/1280/901"
  },
  {
    "id": "31",
    "author": "How-Soon Ngu",
    "width": 3264,
    "height": 4912,
    "url": "https://unsplash.com/photos/7Vz3DtQDT3Q",
    "download_url": "https://picsum.photos/id/31/3264/4912"
  },
  {
    "id": "32",
    "author": "Rodrigo Melo",
    "width": 4032,
    "height": 3024,
    "url": "https://unsplash.com/photos/eG3k60PrTGY",
    "download_url": "https://picsum.photos/id/32/4032/3024"
  },
  {
    "id": "33",
    "author": "Alejandro Escamilla",
    "width": 5000,
    "height": 3333,
    "url": "https://unsplash.com/photos/LBI7cgq3pbM",
    "download_url": "https://picsum.photos/id/33/5000/3333"
  },
  {
    "id": "34",
    "author": "Aleks Dorohovich",
    "width": 3872,
    "height": 2592,
    "url": "https://unsplash.com/photos/zZvsEMPxjIA",
    "download_url": "https://picsum.photos/id/34/3872/2592"
  },
  {
    "id": "35",
    "author": "Shane Colella",
    "width": 2758,
    "height": 3622,
    "url": "https://unsplash.com/photos/znM0ujn2RUA",
    "download_url": "https://picsum.photos/id/35/2758/3622"
  },
  {
    "id": "36",
    "author": "Vadim Sherbakov",
    "width": 4179,
    "height": 2790,
    "url": "https://unsplash.com/photos/osSryggkso4",
    "download_url": "https://picsum.photos/id/36/4179/2790"
  },
  {
    "id": "37",
    "author": "Austin Neill",
    "width": 2000,
    "height": 1333,
    "url": "https://unsplash.com/photos/erTjj730fMk",
    "download_url": "https://picsum.photos/id/37/2000/1333"
  },
  {
    "id": "38",
    "author": "Allyson Souza",
    "width": 1280,
    "height": 960,
    "url": "https://unsplash.com/photos/JabLtzJl8bc",
    "download_url": "https://picsum.photos/id/38/1280/960"
  },
  {
    "id": "39",
    "author": "Luke Chesser",
    "width": 3456,
    "height": 2304,
    "url": "https://unsplash.com/photos/pFqrYbhIAXs",
    "download_url": "https://picsum.photos/id/39/3456/2304"
  }
]
```

Ok so basically:

```
type ImageId = string // TODO: symbol

type ImageRef = string

type Image = {
  id: ImageId,
  author: string,
  width: number,
  height: number,
  url: string,
  download_url: ImageRef
}
```
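Since `download_url` points at the full-size image, the grid will want a smaller variant. A rough sketch, assuming the `https://picsum.photos/id/<id>/<width>/<height>` pattern seen in `download_url` also works for arbitrary sizes; `PicsumImage`, `sizedImageUrl` and `thumbnailUrl` are hypothetical names.

```typescript
// Same shape as the `Image` type above, renamed to avoid clashing with the DOM `Image`.
type PicsumImage = {
  id: string;
  author: string;
  width: number;
  height: number;
  url: string;
  download_url: string;
};

// Builds a URL for the image at a requested size.
function sizedImageUrl(image: PicsumImage, width: number, height: number): string {
  return `https://picsum.photos/id/${image.id}/${Math.round(width)}/${Math.round(height)}`;
}

// Thumbnail of a fixed width whose height preserves the original aspect ratio.
function thumbnailUrl(image: PicsumImage, thumbWidth: number): string {
  const thumbHeight = (image.height / image.width) * thumbWidth;
  return sizedImageUrl(image, thumbWidth, thumbHeight);
}
```

For image `30` (1280x901) a 300px wide thumbnail comes out as `https://picsum.photos/id/30/300/211`.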

# Caching

Can I specify sizes better? What about caching?
Assignment says that I should cache the requests to the `picsum` api. This is simple enough.
I just need to cache the

```
https://picsum.photos/v2/list?limit=10&page=1
```

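request. `useMemo` on its own is probably not enough: it only memoizes a value per component instance, and React is allowed to throw that value away. A small cache keyed by the request URL, wrapped in a custom hook, looks like the safer option.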
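One possible shape for the request cache, sketched without any React: a `Map` from URL to the Promise of the response, so repeated calls for the same page share one request. `createRequestCache`, `fetchJson` and `getImageList` are hypothetical names, and evicting failed requests is my own assumption about the desired behaviour.

```typescript
// Wraps a fetcher so that each URL is requested at most once.
// The cache stores Promises, so concurrent calls also share one request.
function createRequestCache<T>(fetcher: (url: string) => Promise<T>) {
  const cache = new Map<string, Promise<T>>();
  return function cachedFetch(url: string): Promise<T> {
    const hit = cache.get(url);
    if (hit !== undefined) return hit;
    const pending = fetcher(url).catch((error: unknown) => {
      // Drop failed requests so a later call can retry.
      cache.delete(url);
      throw error;
    });
    cache.set(url, pending);
    return pending;
  };
}

// Assumed wiring for the picsum list endpoint (nothing is fetched until called).
const fetchJson = (url: string): Promise<unknown> =>
  fetch(url).then((response) => {
    if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
    return response.json();
  });

const getImageList = createRequestCache(fetchJson);
```

A custom hook would then just call `getImageList(url)` inside an effect; the cache lives at module level, so it survives unmounts and page changes.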

But what about caching of the images themselves? Does the browser handle this automagically?
I'm pretty sure that it does, I definitely thought about this before in itravel. Unfortunately I don't remember the conclusion I reached /facepalm
This is just HTTP GET caching, so should be fine.

To confirm:

```
http --follow --headers https://picsum.photos/id/30/300/200
```

gives

- First request

```
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 818206
Cache-Control: public, max-age=2592000, stale-while-revalidate=60, stale-if-error=43200, immutable
Connection: keep-alive
Content-Disposition: inline; filename="30-300x200.jpg"
Content-Length: 10181
Content-Type: image/jpeg
Date: Thu, 14 May 2026 12:48:29 GMT
Picsum-Id: 30
Server: nginx
Timing-Allow-Origin: *
Vary: Origin
Via: 1.1 varnish
X-Cache: HIT
X-Cache-Hits: 0
X-Served-By: cache-fra-eddf8230047-FRA
X-Timer: S1778762910.527297,VS0,VE1
```

- Second request

```
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 818234
Cache-Control: public, max-age=2592000, stale-while-revalidate=60, stale-if-error=43200, immutable
Connection: keep-alive
Content-Disposition: inline; filename="30-300x200.jpg"
Content-Length: 10181
Content-Type: image/jpeg
Date: Thu, 14 May 2026 12:48:57 GMT
Picsum-Id: 30
Server: nginx
Timing-Allow-Origin: *
Vary: Origin
Via: 1.1 varnish
X-Cache: HIT
X-Cache-Hits: 1
X-Served-By: cache-fra-eddf8230177-FRA
X-Timer: S1778762937.389504,VS0,VE1
```

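`X-Cache: HIT` and the large `Age` show that the response is served from the CDN cache. `Cache-Control: public, max-age=2592000, immutable` additionally allows any cache, including the browser's, to keep the image for 2592000 s (30 days) without revalidating.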
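To double check the 30 days arithmetic, a tiny sketch that splits the `Cache-Control` value from the response above into directives. `parseCacheControl` is a hypothetical helper and deliberately simplistic: it does not handle quoted values that contain commas.

```typescript
// Parses a Cache-Control header value into directive -> value,
// using `true` for flag directives such as `public` or `immutable`.
function parseCacheControl(header: string): Map<string, string | true> {
  const directives = new Map<string, string | true>();
  for (const part of header.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) {
      directives.set(trimmed.toLowerCase(), true);
    } else {
      directives.set(trimmed.slice(0, eq).toLowerCase(), trimmed.slice(eq + 1));
    }
  }
  return directives;
}

const header =
  "public, max-age=2592000, stale-while-revalidate=60, stale-if-error=43200, immutable";
const directives = parseCacheControl(header);
const maxAgeDays = Number(directives.get("max-age")) / 86400; // 86400 seconds per day
console.log(maxAgeDays); // 30
```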
I also need to test the browser behaviour.

Let's try to test the browser img.src behaviour.

```
var img = new Image();
document.body.appendChild(img);
```

Then

```
img.src = "https://picsum.photos/id/30/300/200"; // makes `GET https://picsum.photos/id/30/300/200`, which results in a 302, then `GET https://fastly.picsum.photos/id/30/300/200.jpg?hmac=VXfU9CUIzgRHYSjKg8FAl7JDQIea3VOfR8f98SpfXbo`, which gives 200
img.src = "https://picsum.photos/id/31/300/200"; // makes `GET https://picsum.photos/id/31/300/200`, which results in a 302, then `GET https://fastly.picsum.photos/id/31/300/200.jpg?hmac=WPPS-sLIpuyg7q2io4x82NuBdN-FK1W5uDG3iPVzi2g`, which gives 200
img.src = "https://picsum.photos/id/30/300/200"; // makes `GET https://picsum.photos/id/30/300/200`, which immediately results in a 200. No further requests are made
```

Cool, as expected the browser caches images nicely too.

# Modal

"When an image is clicked, open a modal showing the image in a bigger size."
Will have to setup basic modal, but that's fine.
The point here is that this will just mount a new component (the modal), and in it will be an `<img>` with a different source (the sizes are gonna be different). I don't have to do any manual `fetch`.
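For the "bigger size" in the modal, one option is to scale the original dimensions down to whatever box the modal offers and request exactly that size. A sketch under that assumption; `fitWithin`, `modalImageUrl` and the bounds are hypothetical, and the URL pattern is the same one `download_url` uses.

```typescript
type Size = { width: number; height: number };

// Scales `original` down to fit inside `bounds`, preserving aspect ratio.
// Never scales up: images smaller than the bounds keep their size.
function fitWithin(original: Size, bounds: Size): Size {
  const scale = Math.min(
    bounds.width / original.width,
    bounds.height / original.height,
    1,
  );
  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}

// URL of the image sized for the modal.
function modalImageUrl(id: string, original: Size, bounds: Size): string {
  const { width, height } = fitWithin(original, bounds);
  return `https://picsum.photos/id/${id}/${width}/${height}`;
}
```

For image `31` (3264x4912) and a 1200x800 box this yields `https://picsum.photos/id/31/532/800`, which then simply becomes the `src` of the modal's `<img>`.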

# Material UI

Yep, it is available as a react library:

- it even has a modal component `Dialog` which I could use.
- `Button` for the prev/next
- `CircularProgress` for loading

Ideally I would love to use something like a fake list of images that are loading when making the `getImages` request to `picsum` api.
Need to decide if the current page will just have 10 stable (under prev/next) react components, or will they re-mount from scratch.
Yeah, I'm pretty sure these will not remount. What would be the point of that? The images would get stable keys (let's say 1 to 10). But this seems overcomplicated. Let's do something simpler `key={imageId}`. You shouldn't really care about the React remounting all of this. It's not perceptible by humans.
When a page is changed, the state that represents them is set to loading (that's when we don't even have the url computed yet).
But suppose we do compute the `src` - so under the stable keys the underlying `<img>` DOM node's src will get mutated and either the cached image is gonna be displayed, or GET request is gonna be made by the browser.
This is I guess a bit weird... the image is still loading in some sense... Ideally I would prefer to set it to loaded once the real image is loaded for real.

Fuck it. the states should be

```
| LoadingPage // waiting for the `loadImages`. Here display a "ghost" grid of 10 images.
| Loaded {
  , imageIds: List<ImageId>
  , modal: Option<{ openedImage: ImageId }> // by default modal is not open, but when an image is clicked, it sets this to `some`
  }
```
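The same state written as a TypeScript discriminated union, plus a reducer for the transitions. This is only a sketch: the event names (`PageRequested`, `PageLoaded`, `ImageClicked`, `ModalClosed`) are made up here, and `null` stands in for `Option`'s `none`.

```typescript
type ImageId = string;

type GalleryState =
  | { kind: "LoadingPage" } // waiting for the list request, show the ghost grid
  | { kind: "Loaded"; imageIds: ImageId[]; modal: { openedImage: ImageId } | null };

type GalleryEvent =
  | { kind: "PageRequested" }
  | { kind: "PageLoaded"; imageIds: ImageId[] }
  | { kind: "ImageClicked"; imageId: ImageId }
  | { kind: "ModalClosed" };

function reduce(state: GalleryState, event: GalleryEvent): GalleryState {
  switch (event.kind) {
    case "PageRequested":
      return { kind: "LoadingPage" };
    case "PageLoaded":
      // A freshly loaded page always starts with the modal closed.
      return { kind: "Loaded", imageIds: event.imageIds, modal: null };
    case "ImageClicked":
      // Ignore clicks while loading or for ids that are not on the current page.
      if (state.kind !== "Loaded" || !state.imageIds.includes(event.imageId)) {
        return state;
      }
      return { ...state, modal: { openedImage: event.imageId } };
    case "ModalClosed":
      return state.kind === "Loaded" ? { ...state, modal: null } : state;
  }
}
```

This plugs directly into `useReducer`, and the impossible combination "modal open while the page is loading" cannot be represented at all.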

# AI generated descriptions

Now this is a bit weird. I'm kinda sceptical that there's a freely available api that would generate image descriptions.
I would need to send the image to the model (I guess there are apis that accept a simple url, I'm pretty sure I don't have to encode the image into binary and send it like that to the model directly).

Another (insane) option would be to bundle some very simple model in the client (like a wasm image), but I bet anything useful is still like at least ~500 MB lol.

Another option would be to self-host it, but that's kinda a lot of work (setting up the server, exposing an api, talking to the model locally, self-hosting). Not even sure my minipc could handle any non-trivial model.

The non-insane option is of course to use some externally hosted model that exposes an api endpoint that takes in an image url and responds with a description.
The problem again is: I shouldn't really deploy that one then. I would have to expose an API key to the public facing internet. Or even have it in github /facepalm, which is terrible.
But then again, this is just at most a 4 hour task.
I could self-host a tiny api endpoint though on my minipc. But I don't really want to introduce a new server functionality - my pages right now are completely static.

- The easiest reasonable solution would be to find a free (yeah, right) inference service without needing an API key. Unlikely
- Another option would be to get an API key for an image-to-text service, and set up a serverless function that would just proxy `image_url` while hiding the API key.
  I would hardcode the api endpoint url in github. I think this is fine for this particular assignment.
  This is probably what I'm gonna use. The only problem I guess could be CORS. When self-hosting
  TODO: Take a look at cloudflare workers.
  TODO: Decide which image-to-text model to use.
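A runtime-agnostic sketch of what the proxy's core could look like, whichever serverless platform ends up hosting it. Everything here is an assumption: `ALLOWED_ORIGIN` is a placeholder for the static site's origin, `describeImage` stands in for the still undecided image-to-text call (which is where the secret API key would be attached), and the response is a plain object that the platform's handler would translate into its own response type.

```typescript
// Plain-data response, so the sketch does not depend on any serverless runtime.
type ProxyResponse = { status: number; headers: Record<string, string>; body: string };

// Placeholder: the only site that should be calling the proxy.
const ALLOWED_ORIGIN = "https://example.com";
// Only picsum-hosted images are forwarded, so the proxy cannot describe arbitrary URLs.
const ALLOWED_IMAGE_HOSTS = ["picsum.photos", "fastly.picsum.photos"];

// Returns the validated `image_url` query parameter, or null if it is
// missing, malformed, not https, or not hosted on picsum.
function extractImageUrl(requestUrl: string): string | null {
  try {
    const raw = new URL(requestUrl).searchParams.get("image_url");
    if (raw === null) return null;
    const parsed = new URL(raw);
    const allowed =
      parsed.protocol === "https:" && ALLOWED_IMAGE_HOSTS.includes(parsed.hostname);
    return allowed ? parsed.toString() : null;
  } catch {
    return null; // one of the URLs could not be parsed
  }
}

// `describeImage` is hypothetical: the real one calls the model with the secret key.
async function handleDescribe(
  requestUrl: string,
  describeImage: (imageUrl: string) => Promise<string>,
): Promise<ProxyResponse> {
  const headers = {
    "Content-Type": "application/json",
    // Lets the browser on the static site read the response (the CORS concern).
    "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
  };
  const imageUrl = extractImageUrl(requestUrl);
  if (imageUrl === null) {
    return { status: 400, headers, body: JSON.stringify({ error: "invalid image_url" }) };
  }
  const description = await describeImage(imageUrl);
  return { status: 200, headers, body: JSON.stringify({ description }) };
}
```

Restricting `image_url` to picsum hosts keeps the publicly known endpoint from being used as a free general-purpose captioning service, which matters because its URL would be hardcoded in the repo.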