Episode #416 from 25:07
JEPA (Joint-Embedding Predictive Architecture)
What is joint embedding? What are these architectures that you're so excited about?

Okay, so now, instead of training a system to encode the image and then training it to reconstruct the full image from a corrupted version, you take the full image, you take the corrupted or transformed version, and you run them both through encoders, which, in general, are identical, but not necessarily. And then you train a predictor on top of those encoders to predict the representation of the full input from the representation of the corrupted one. So it's joint embedding because you're taking the full input and the corrupted or transformed version, you run them both through encoders, and you get a joint embedding. And then you're saying, can I predict the representation of the full one from the representation of the corrupted one?
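The setup described above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the architecture from any actual JEPA implementation: the encoder and predictor are random linear maps, the corruption is simple masking, and the names `make_linear`, `apply_linear`, `corrupt`, and `jepa_loss` are all hypothetical. The point it shows is that the error is measured between representations, not between reconstructed pixels.

```python
import random

def make_linear(in_dim, out_dim, rng):
    """Random weight matrix standing in for a network (hypothetical toy model)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

def apply_linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def corrupt(x, mask_fraction, rng):
    """Zero out a random subset of entries: one simple choice of corruption."""
    n_masked = int(len(x) * mask_fraction)
    masked = set(rng.sample(range(len(x)), n_masked))
    return [0.0 if i in masked else v for i, v in enumerate(x)]

def jepa_loss(x, encoder, predictor, rng, mask_fraction=0.5):
    """Squared error between predicted and actual representation of the full input."""
    # Assumption: the same encoder is used on both branches ("in general, identical").
    target = apply_linear(encoder, x)                                 # full input
    context = apply_linear(encoder, corrupt(x, mask_fraction, rng))   # corrupted input
    predicted = apply_linear(predictor, context)                      # predict in representation space
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

rng = random.Random(0)
image = [rng.random() for _ in range(16)]   # stand-in for a flattened image
encoder = make_linear(16, 4, rng)
predictor = make_linear(4, 4, rng)
loss = jepa_loss(image, encoder, predictor, rng)
print(loss)
```

In a real system the encoder and predictor would be trained to reduce this loss. The sketch only evaluates it once and leaves training out entirely.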