
Grok Imagine is the Fastest Text-to-Video Generator On The Internet

Earn $50–$100 making 12–30 sec videos with SORA


In today’s Newsletter

  1. Grok Imagine is the Fastest Text-to-Video Generator On The Internet

  2. Turn Every Word You Speak in a Day Into Actionable Ideas

  3. Meet OpenTSLM: AI That Understands Medical Time-Series Data

  4. Earn $50–$100 making 12–30 sec videos with SORA

  5. How to use Nano Banana to Make Stunning Images

  6. Google Introduces A Smarter Way to Do Voice Search

  7. Create Viral YouTube Shorts Using SORA 2 + Prompt

Grok Imagine is the Fastest Text-to-Video Generator On The Internet

Turn Every Word You Speak in a Day Into Actionable Ideas

The World’s Most Wearable AI

Limitless is your new superpower: an AI-powered pendant that captures and remembers every conversation, insight, and idea you encounter throughout your day.

Built for tech leaders who need clarity without the clutter, Limitless automatically transcribes and summarizes meetings, identifies speakers, and delivers actionable notes right to your fingertips. It’s securely encrypted, incredibly intuitive, and endlessly efficient.

Order now and reclaim your mental bandwidth today.

Earn $50–$100 making 12–30 sec videos with SORA

How to use Nano Banana to Make Stunning Images

  1. Open Nano Banana, click the plus button, and upload a clear base photo.

  2. In the chat box, write a short prompt and press the arrow to generate.

  3. If the result is close but not perfect, reply with a tiny tweak rather than rewriting the whole prompt.

1) Lock in face and scene consistency

  1. Begin with one base image of the person or product you want to keep consistent.

  2. Make small edits one at a time so Nano Banana “remembers” the look.

  3. Use short prompts that change only the context, not the face.

    • Try: “Same person, business headshot background”

    • Try: “Same outfit, sunset lighting”

  4. If the face drifts, nudge it back.

    • Try: “Keep the same face and expression as the original”

2) Change small details without touching the rest

  1. Point to exactly what you want changed.

  2. Use everyday language. No long descriptions needed.

  3. Iterate with tiny corrections until it feels right.

    • Try: “Make the walls light blue”

    • Try: “Add a small round wooden side table”

    • Try: “The mirror is too large, make it smaller and simpler”

  4. If Nano Banana edits too much, narrow the scope.

    • Try: “Only change the curtains, leave everything else the same”

3) Blend images and restore old photos

  1. Upload up to three images you want to combine.

  2. Tell Nano Banana what to take from each image.

    • Try: “Use the face from image one and the jacket from image two”

    • Try: “Place the dog from image three on the sofa in image one”

  3. For restoration, attach the old photo and ask for careful fixes.

    • Try: “Restore and colorize this photo with natural skin tones”

    • Try: “Remove scratches and keep the original texture”

  4. If colors look off, refine with a gentle correction.

    • Try: “Make the suit navy blue and the background warm cream”

Prompt cheat sheet

  • Keep it short: one idea per sentence

  • Be specific: name the item and the change

  • Iterate: “a bit lighter”, “slightly smaller”, “more natural”

  • Protect the base: “leave the face and lighting the same”

Meet OpenTSLM: AI That Understands Medical Time-Series Data (Source)

A team from Stanford, ETH Zurich, Google Research, and Amazon has unveiled OpenTSLM, a new family of Time-Series Language Models designed to handle complex medical signals like ECGs, EEGs, and wearable data. Unlike GPT-4o and other frontier models that struggle with continuous signals, OpenTSLM treats time-series as a native modality rather than forcing it into text or images, making it far better at capturing the subtle changes doctors rely on for diagnosis.

The researchers tested two designs. SoftPrompt encoded signals into text-like tokens but required huge amounts of memory. The real breakthrough came with OpenTSLM-Flamingo, which stays memory-efficient and scalable, using specialized encoders and attention mechanisms to process long signal streams without breaking down. In trials, it used far less VRAM while still outperforming GPT-4o by a wide margin.

Results were striking. On new benchmarks, OpenTSLM achieved nearly 70% accuracy in sleep-stage detection, compared to GPT-4o’s 15%. Even its smaller 1B-parameter models beat massive general-purpose models, showing the power of specialized AI. At Stanford Hospital, cardiologists reviewing OpenTSLM’s ECG interpretations found its reasoning correct or partially correct in over 90% of cases, praising its ability to explain predictions clearly.
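The core idea behind treating a signal as a "native modality" can be sketched in a few lines: slice the raw series into fixed-length patches, project each patch into the language model's embedding space, and prepend those signal tokens to the text tokens so the model attends over both in one sequence. This is a toy illustration of that general pattern, not OpenTSLM's actual code; the dimensions, the random projection, and the variable names are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def patch_series(signal, patch_len):
    """Split a 1-D signal into non-overlapping patches (drop any partial tail)."""
    n = len(signal) // patch_len
    return signal[: n * patch_len].reshape(n, patch_len)

def encode_patches(patches, proj):
    """Linearly project each raw patch into the model's embedding space."""
    return patches @ proj

# Toy "ECG": 1,000 samples; 50-sample patches -> 20 time-series tokens.
ecg = rng.standard_normal(1000)
patches = patch_series(ecg, patch_len=50)

d_model = 64
proj = rng.standard_normal((50, d_model)) * 0.1  # stand-in for a learned projection
ts_tokens = encode_patches(patches, proj)

# Prepend the signal tokens to (toy) text-prompt embeddings, so a language
# model would attend over the signal and the question in one sequence.
text_tokens = rng.standard_normal((8, d_model))
sequence = np.concatenate([ts_tokens, text_tokens], axis=0)
print(sequence.shape)  # (28, 64)
```

Because the signal enters as its own token stream instead of being rounded into words, the model can pick up small waveform changes that a text transcription would lose.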

Google Introduces A Smarter Way to Do Voice Search (Source)

Google has introduced Speech-to-Retrieval (S2R), a new system that skips the old step of turning speech into text. Instead, it understands the meaning of your spoken query directly and retrieves the right answer. This makes voice search faster and more accurate, especially when transcription errors would normally change the results.

The model uses two encoders: one processes the audio and the other processes documents. Together, they learn to match spoken questions with the most relevant information. Tests show S2R delivers much better results than traditional methods and works across many languages. Google is already rolling it out in Search and has released a dataset for researchers to build on this progress.
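The two-encoder retrieval step can be pictured as nearest-neighbor search in a shared embedding space: the audio encoder maps the spoken query to a vector, the document encoder maps each document to a vector, and search returns the document whose vector is closest. The sketch below uses hand-made stand-in embeddings so it runs on its own; in S2R these vectors would come from the trained encoders, and the document names are invented for the example.

```python
import numpy as np

def cosine_scores(query_vec, doc_matrix):
    """Cosine similarity between one query embedding and each document embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    return d @ q

# Stand-in document embeddings (hand-made, purely illustrative).
docs = {
    "pasta_recipe": np.array([0.9, 0.1, 0.0]),
    "weather_report": np.array([0.1, 0.9, 0.1]),
    "stock_prices": np.array([0.0, 0.2, 0.9]),
}
doc_matrix = np.stack(list(docs.values()))

# A spoken query whose *meaning* is "how do I cook pasta" lands near the
# recipe vector, even if a text transcript of the audio contained errors.
audio_query = np.array([0.85, 0.15, 0.05])

scores = cosine_scores(audio_query, doc_matrix)
best = list(docs)[int(np.argmax(scores))]
print(best)  # pasta_recipe
```

The point of skipping transcription is visible here: retrieval depends only on where the audio embedding lands, so a misheard word never gets the chance to change the query.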

Create Viral YouTube Shorts Using SORA 2

Here is the prompt:

A tense, chaotic, first-person POV shot filmed on a shaky, handheld camera from the passenger seat of a beat-up SUV. The camera is pointed backward through the slightly cracked rear window. The SUV is speeding wildly down a narrow, dusty, dirt road in a desolate mountain valley at late afternoon. Filling the frame, a massive, muscular male yak is charging at full speed, mere feet behind the vehicle. The yak's eyes are focused and aggressive, and its movement is violently realistic, reflecting its enormous weight and momentum. Include natural, overlapping, and panicking dialogue from the people inside the vehicle (e.g., muffled shouts like 'Go! Go faster!' and 'It's right on us!'). The audio track must feature the loud, realistic sound of heavy hooves pounding the ground directly behind the car, the roar of the engine, and the rattling of the shaky camera. The lighting is harsh and dusty, giving the whole scene a raw, found-footage documentary realism with complex, real-time interactions.

Top AI tools to check out…

I’ve been building a directory of the best AI tools, where you’ll find the best of the best along with their use cases.

How Satisfied Are You with Today’s Newsletter?


Thanks for your time!

Shailesh & OpenAILearning team