🔧 Memoire: Create Narrated Videos with AI in Minutes!


🔗 Source: dev.to

This is a submission for The Pinata Challenge

✒️ Introduction

Creating captivating videos with engaging narratives can be time-consuming and complex. It may even end up unprofessional. Ever try outsourcing a narration/voiceover to someone? Get ready to cough up a good amount of money for that. What if there was a way to simplify this process using AI? And cheaper too?

Meet Memoire, an AI-powered tool designed to create narrated videos in minutes. Whether you're a content creator, a marketer, or just someone who loves sharing stories, Memoire is here to transform your ideas into stunning videos effortlessly.

In this article, I'll walk you through Memoire, showcasing its features, the challenges faced during development, and the exciting possibilities it offers.

Memoire Landing Page

🔐 Key Features

1/ Full-Featured Authentication: Memoire ensures security and user experience with its robust authentication system powered by NextAuth, allowing only verified users to access the app. The system includes beautifully designed emails for account verification and password resets, enhancing both functionality and user engagement.

Memoire Feature Screenshot, #1

2/ Upload Media and Generate Descriptions: You can upload your photos, and Memoire will generate accurate and engaging descriptions for them. If the description is missing important context, you can easily add your input and regenerate a more fitting description.

Memoire Feature Screenshot, #2

3/ Media Transitions: Elevate your video storytelling with Memoire's diverse media transitions, offering options like "fade," "wipeleft," "slideup," and more. These transitions provide a professional touch, ensuring smooth and visually appealing scene changes in your videos.

Memoire Feature Screenshot, #3

4/ Sortable Media List: Uploading photos in batches can sometimes lead to an unpredictable order of completion. With Memoire, you can easily drag and drop media boxes to arrange them in the order you prefer.

Memoire Feature Screenshot, #4

5/ AI Script Generation: Memoire uses Google's Gemini 1.5 Pro model to generate scripts for your videos. This ensures high-quality, contextually relevant scripts that enhance your video narratives.

Memoire Feature Screenshot, #5

6/ AI Audio Generation with Selectable Voices: Powered by OpenAI's TTS-1 model, Memoire offers customizable voices for your narrations. Choose from Echo, Alloy, Fable, Onyx, Nova, and Shimmer to find the perfect voice for your project.

Memoire Feature Screenshot, #6

7/ Project Settings: Customize your project by adding a description, which helps the AI generate better scripts. You can also change your project's aspect ratio and frame rate to suit your needs.

Memoire Feature Screenshot, #7

8/ In-Browser Output Generation: Memoire uses Remotion to generate video previews directly in your browser. Although the preview has some minor differences from the final output, fixes are underway to improve it.

Memoire Feature Screenshot, #8

9/ AI Music Generation: Memoire leverages Meta's MusicGen model to generate background music for your videos. This feature is still a work in progress and is not available for public testing yet.

10/ AI-Powered Subtitle Generation: Using OpenAI's Whisper model, Memoire can generate subtitles for your videos. This feature is also in development and will be available soon.

🛠️ Tech Stack

  • FrontEnd: TypeScript, Next.js, DND Kit

  • BackEnd: Next.js API Routes, Server Actions, Prisma

  • Styling: Tailwind CSS, shadcn/ui components

  • File Storage: Pinata

  • Rate Limit: Upstash

  • Authentication: NextAuth

  • AI Models: Google's Gemini 1.5 Pro, OpenAI's TTS-1, Meta's MusicGen, OpenAI's Whisper

  • In-Browser Preview: Remotion

🦄 How I Used Pinata

I had fun trying out a couple of things with Pinata! Here they are:

1/ Multi-File Upload Component (w/ Progress Tracking) (MediaPane.tsx):
I used Pinata's raw API endpoint to build a multi-file upload component with real-time progress tracking. Calling the API directly gives more control and a better user experience than the SDK, which has no built-in way to report upload progress.

Key Features:

  • Direct upload to Pinata using axios
  • JWT-based authentication for secure uploads
  • Real-time upload progress tracking

Here's how it works:

a. Fetch JWT for authentication:

const keyRequest = await fetch('/api/key');
const keyData = await keyRequest.json() as { JWT: string };

b. Prepare and send the upload request:

const UPLOAD_ENDPOINT = `https://uploads.pinata.cloud/v3/files`;
const formData = new FormData();
formData.append(`file`, addedFileState.file);

const { data: uploadResponse }: AxiosResponse<{ data: PinataUploadResponse }> = await axios.post(UPLOAD_ENDPOINT, formData, {
    headers: {
        Authorization: `Bearer ${keyData.JWT}`
    },
    onUploadProgress: async (progressEvent) => {
        if (progressEvent.total) {
            const percentComplete = (progressEvent.loaded / progressEvent.total) * 100;
            updateFileProgress(addedFileState.key, percentComplete);
        }
    }
});

c. Track upload progress:

onUploadProgress: async (progressEvent) => {
    if (progressEvent.total) {
        const percentComplete = (progressEvent.loaded / progressEvent.total) * 100;
        updateFileProgress(addedFileState.key, percentComplete);
    }
}

d. Handle the upload response and prepare metadata:

await new Promise(resolve => setTimeout(resolve, 1000));
updateFileProgress(addedFileState.key, 'COMPLETE');

const data = addedFileState.type === 'PHOTO'
    ? await getPhotoDimensions(addedFileState.preview)
    : await getVideoDimensions(addedFileState.preview);

const metadata = { ...data, cid: uploadResponse.data.cid, type: addedFileState.type };

This implementation allows for a seamless upload experience with visual feedback, enhancing user interaction during the potentially time-consuming process of uploading media files.
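
To make the progress bookkeeping concrete, here is a minimal sketch of how per-file and overall progress could be computed. The names `FileProgress`, `ProgressLike`, `toPercent`, and `overallProgress` are hypothetical and are not taken from the Memoire codebase; the only thing assumed from the snippets above is that a file's progress is either a percentage or the string 'COMPLETE'.

```typescript
// Hypothetical types and helpers for illustration only.
type FileProgress = number | 'COMPLETE';

// Minimal shape of a progress event: bytes sent so far and, if known, the total.
interface ProgressLike {
    loaded: number;
    total?: number;
}

// Convert a progress event into a whole percentage clamped to 0..100.
// Returns undefined when the total size is unknown.
function toPercent(event: ProgressLike): number | undefined {
    if (!event.total || event.total <= 0) {
        return undefined;
    }
    const percent = (event.loaded / event.total) * 100;
    return Math.min(100, Math.max(0, Math.round(percent)));
}

// Average progress across a batch of uploads, counting 'COMPLETE' as 100.
function overallProgress(files: Map<string, FileProgress>): number {
    if (files.size === 0) {
        return 0;
    }
    let sum = 0;
    for (const value of files.values()) {
        sum += value === 'COMPLETE' ? 100 : value;
    }
    return Math.round(sum / files.size);
}
```

A helper like `overallProgress` would let a batch upload show one combined progress bar in addition to the per-file bars.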

2/ Custom Image Component (PinataImage.tsx):

A custom PinataImage component was created to efficiently handle image retrieval, caching, and display. This component optimizes performance by reducing unnecessary network requests and leveraging browser storage.

Key Features:

  • Local caching using IndexedDB
  • Signed URL generation for secure access
  • Lazy loading and skeleton placeholders

Here's a breakdown of its functionality:

a. Check for cached images:

const cachedImage = await db.images.where({ cid, width, height }).first();
if (cachedImage) {
    setImageUrl(URL.createObjectURL(cachedImage.blob));
    return;
}

b. Generate signed URL for secure access:

const params = new URLSearchParams({
    cid,
    width: width?.toString() || '',
    height: height?.toString() || '',
    expires
});

const response = await fetch(`/api/getSignedUrl?${params}`);
if (!response.ok) {
    throw new Error('Failed to fetch signed URL');
}

const data = await response.json() as { url: string };

c. Fetch and cache the image:

const imageResponse = await fetch(`/api/getImage?url=${encodeURIComponent(data.url)}`);
if (!imageResponse.ok) {
    throw new Error('Failed to fetch image');
}

const blob = await imageResponse.blob();
const objectUrl = URL.createObjectURL(blob);
setImageUrl(objectUrl);

await db.images.put({ cid, width: Number(width), height: Number(height), blob });

d. Render the image or a skeleton placeholder:

const renderedImage = useMemo(() => {
    if (imageUrl) {
        return (
            <Image
                src={imageUrl}
                unoptimized={!!src}
                width={Number(width)}
                height={Number(height)}
                alt={alt}
                className={className}
                crossOrigin='anonymous'
                {...props}
            />
        );
    } else {
        return (
            <Skeleton className={className} />
        );
    }
}, [imageUrl, width, height, src, alt, className, props]);

This component ensures efficient loading and display of images stored on Pinata, improving the overall performance and user experience of Memoire.
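
The cache-first flow in steps a to c can be summarized as a small generic helper. This is a sketch under assumptions, not the component's actual code: `AsyncCache`, `MemoryCache`, `imageKey`, and `loadCached` are hypothetical names, and the in-memory `Map` stands in for the IndexedDB table that Memoire uses.

```typescript
// Hypothetical cache interface; in Memoire the cache is an IndexedDB table.
interface AsyncCache<V> {
    get(key: string): Promise<V | undefined>;
    put(key: string, value: V): Promise<void>;
}

// In-memory stand-in for the IndexedDB table, handy for tests.
class MemoryCache<V> implements AsyncCache<V> {
    private readonly store = new Map<string, V>();

    async get(key: string): Promise<V | undefined> {
        return this.store.get(key);
    }

    async put(key: string, value: V): Promise<void> {
        this.store.set(key, value);
    }
}

// Build one cache key per (cid, width, height) variant of an image.
function imageKey(cid: string, width?: number, height?: number): string {
    return `${cid}:${width ?? 'auto'}x${height ?? 'auto'}`;
}

// Return the cached value if present; otherwise fetch it once and store it.
async function loadCached<V>(
    cache: AsyncCache<V>,
    key: string,
    fetcher: () => Promise<V>
): Promise<V> {
    const hit = await cache.get(key);
    if (hit !== undefined) {
        return hit;
    }
    const value = await fetcher();
    await cache.put(key, value);
    return value;
}
```

Note that two concurrent calls for the same uncached key would both run the fetcher in this sketch; deduplicating in-flight requests would need an extra map of pending promises.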

3/ Media Management and Retrieval (VideoPreview.tsx):

In addition to uploading and displaying images, Pinata is used for storing and retrieving various types of media, including audio and video files. This is evident in the VideoPreview component:

a. Retrieve media files using their CIDs:

const getMediaUrl = useCallback(async (cid: string, projectId: string, type: 'media' | 'audio'): Promise<string> => {
    try {
        if (typeof window === 'undefined') {
            return '';
        }

        const table = type === 'media' ? db.media : db.audio;
        let item = await table.where({ cid }).first();
        if (item) {
            return URL.createObjectURL(item.file);
        }

        const response = await fetch(`/api/getFile?cid=${encodeURIComponent(cid)}`);
        if (!response.ok) {
            throw new Error(`HTTP error! status: ${response.status}`);
        }

        const blob = await response.blob();

        await table.put({
            cid,
            file: blob,
            projectId
        });

        return URL.createObjectURL(blob);
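    } catch (error) {
        // Log the failure and fall back to an empty URL instead of throwing.
        console.error('Error loading media file :>>', error);
        return '';
    }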
}, []);

b. Load audio files for narration:

const loadAudio = useCallback(async () => {
    if (narration?.audioCid) {
        const audioUrl = await getMediaUrl(narration.audioCid, project.id, 'audio');
        setLoadedAudioUrl(audioUrl);
        setNarration({ audioUrl });
    }
    // eslint-disable-next-line react-hooks/exhaustive-deps
}, [narration?.audioCid, project.id, getMediaUrl]);

c. Load and sort media items:

const loadMediaItems = useCallback(async () => {
    try {
        const loadedItems = await Promise.all(
            mediaItems.map(async (media) => ({
                ...media,
                url: await getMediaUrl(media.cid, project.id, 'media')
            }))
        );

        const sortedMediaItems = [...loadedItems].sort((first, next) =>
            project.mediaOrder.indexOf(first.id) - project.mediaOrder.indexOf(next.id)
        );

        // Compare sortedMediaItems with loadedMediaItems
        const hasChanged = loadedMediaItems.length === 0 ||
            sortedMediaItems.length !== loadedMediaItems.length ||
            sortedMediaItems.some((item, index) => {
                const loadedItem = loadedMediaItems[index];
                return !loadedItem ||
                    item.duration !== loadedItem.duration ||
                    item.transition !== loadedItem.transition;
            });

        if (hasChanged) {
            setLoadedMediaItems(sortedMediaItems);
        }

        await loadAudio();
    } catch (error) {
        console.error('Error loading media items :>>', error);
    }
}, [mediaItems, loadedMediaItems, getMediaUrl, project.id, project.mediaOrder, loadAudio]);

This comprehensive approach to media management allows for efficient storage, retrieval, and playback of various media types within Memoire.
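
The ordering step in part c calls indexOf inside the comparator, which rescans the order array on every comparison. As an alternative sketch (the helper name `sortByOrder` is hypothetical and not from the Memoire codebase), the positions can be indexed once in a Map. One behavioral difference is deliberate and worth flagging: items whose id is missing from the order list go to the end here, whereas indexOf returns -1 and would place them first.

```typescript
// Sort items to match an explicit list of ids without mutating the input.
// Ids missing from `order` are placed last and keep their relative order,
// because Array.prototype.sort is stable.
function sortByOrder<T extends { id: string }>(
    items: readonly T[],
    order: readonly string[]
): T[] {
    // Index each id's position once instead of calling indexOf per comparison.
    const position = new Map<string, number>();
    order.forEach((id, index) => position.set(id, index));

    const rank = (item: T): number =>
        position.get(item.id) ?? Number.MAX_SAFE_INTEGER;

    return [...items].sort((first, next) => rank(first) - rank(next));
}
```

For the handful of media items in a typical project the difference is negligible, so this is mainly a readability and edge-case choice rather than a needed optimization.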

💪 Challenges Faced

1/ Pinata Integration: Working with Pinata was an intriguing experience. Their JavaScript SDK for uploading files presented a challenge: it lacked a built-in method for tracking upload progress, which was crucial for my project to provide users with real-time feedback. Determined to find a solution, I dove into their documentation and discovered that I could use the API directly to achieve this.

Also, instead of following the conventional approach of prefetching signed URLs, I opted for a different route. I made API calls directly from the front end and cached the responses using IndexedDB. This innovative strategy allowed me to load each file only once, significantly minimizing the number of API calls to Pinata and ultimately saving on credits 😬. It was a rewarding challenge that pushed me to think creatively and efficiently!

2/ AI Integration: Integrating AI services for narration and script generation was a significant challenge. Ensuring that the AI produces high-quality output required extensive testing and fine-tuning. I also ran into rate limits while I was testing aggressively.

3/ User Experience: Creating an intuitive and user-friendly interface was crucial. I spent a considerable amount of time designing and iterating on the UI to ensure it meets users' needs while being aesthetically pleasing. This was a lot tougher for me because I didn't have the time to bring in a designer to work with me ;(.

📸 Screenshots

Memoire Screenshots

🔗 Project Link

Link: https://dub.sh/MemoireDemo

💻 Code Repository

Link: https://git.new/MemoireRepo

⚠ Known Issues

1/ Narration audio not syncing up with video.
2/ Video preview component flickers unnecessarily on first load.

✨ Conclusion

Memoire is designed to simplify video creation. By harnessing the power of AI, I've made it possible to produce high-quality narrated videos in minutes for dirt cheap. Whether you're looking to create content for social media, marketing campaigns, or personal projects, Memoire has you covered.

I'm excited to see what you'll create with Memoire. Feel free to share your feedback and let me know how I can improve. Stay tuned for more updates and features!
