Search Results for: open source
Visual Studio Code – a free code editor, built on open source, that runs everywhere.
https://www.freecodecamp.org/news/how-to-set-up-vs-code-for-web-development/
BlueGriffon – the new open source WYSIWYG, NVU-like HTML and CSS editor
BlueGriffon is an open source WYSIWYG editor powered by Gecko, the rendering engine developed for Mozilla Firefox. One of a few derivatives of NVU, a now-discontinued HTML editor, BlueGriffon is the only actively developed NVU derivative that supports HTML5 as well as modern components of CSS.
If your goal is to write as little actual HTML as possible, then BlueGriffon is the tool you want. It’s a true drag-and-drop WYSIWYG website designer, and even includes a dual view option so you can see the code behind your design, in case you want to edit it or just learn from it.
It also supports the EPUB ebook format, so you don’t have to just publish to the web: you can provide your readers with a download of your content that they can take with them. Licensed under the MPL, GPL, and LGPL, a version of BlueGriffon is available for Linux, Windows, and Mac.
Source: opensource.com/alternatives/dreamweaver
Open Source Nvidia Omniverse
blogs.nvidia.com/blog/2019/03/18/omniverse-collaboration-platform/
developer.nvidia.com/nvidia-omniverse
An open, Interactive 3D Design Collaboration Platform for Multi-Tool Workflows to simplify studio workflows for real-time graphics.
It supports Pixar’s Universal Scene Description technology for exchanging information about modeling, shading, animation, lighting, visual effects and rendering across multiple applications.
It also supports NVIDIA’s Material Definition Language, which allows artists to exchange information about surface materials across multiple tools.
With Omniverse, artists can see live updates made by other artists working in different applications. They can also see changes reflected in multiple tools at the same time.
For example, an artist using Maya with a portal to Omniverse can collaborate with another artist using UE4, and both will see live updates of each other’s changes in their own application.
This AI is so good at writing that its creators at the open source OpenAI won’t let you use it
edition.cnn.com/2019/02/18/tech/dangerous-ai-text-generator/index.html
OpenAI researchers fed the following Lord-of-the-Rings-style prompt to the system: Legolas and Gimli advanced on the orcs, raising their weapons with a harrowing war cry.
The computer composed this appropriately violent addition:
The orcs’ response was a deafening onslaught of claws, claws, and claws; even Elrond was forced to retreat. “You are in good hands, dwarf,” said Gimli, who had been among the first to charge at the orcs; it took only two words before their opponents were reduced to a blood-soaked quagmire, and the dwarf took his first kill of the night.
The company’s decision to keep it from public use is the latest indication of a growing unease in and around the tech community about building cutting-edge technology, in particular AI, without setting limits on how it can be deployed.
Amazon and Microsoft in particular have voiced their support for legislation to regulate how facial recognition technology can and can’t be used. And Amazon investors and employees (as well as dozens of civil rights groups) have urged the company to stop selling its face-recognition technology, Rekognition, to government agencies due to concerns it could be used to violate people’s rights.
HandBrake – The open source video transcoder
HandBrake is a tool for converting video from nearly any format to a selection of modern, widely supported codecs.
apertus – open source cinema
The goal of the community driven apertus° project is to create a variety of powerful, free (in terms of liberty) and open cinema tools that we as filmmakers love to use.
Foundry Nuke Cattery – A library of open-source machine learning models
The Cattery is a library of free third-party machine learning models converted to .cat files to run natively in Nuke, designed to bridge the gap between academia and production, providing all communities access to different ML models that all run in Nuke. Users will have access to state-of-the-art models addressing segmentation, depth estimation, optical flow, upscaling, denoising, and style transfer, with plans to expand the models hosted in the future.
https://www.foundry.com/insights/machine-learning/the-artists-guide-to-cattery
https://community.foundry.com/cattery
HDRI Resources
Text2Light
- https://www.cgtrader.com/free-3d-models/exterior/other/10-free-hdr-panoramas-created-with-text2light-zero-shot
- https://frozenburning.github.io/projects/text2light/
- https://github.com/FrozenBurning/Text2Light
Royalty free links
- https://locationtextures.com/panoramas/
- http://www.noahwitchell.com/freebies
- https://polyhaven.com/hdris
- https://hdrmaps.com/
- https://www.ihdri.com/
- https://hdrihaven.com/
- https://www.domeble.com/
- http://www.hdrlabs.com/sibl/archive.html
- https://www.hdri-hub.com/hdrishop/hdri
- http://noemotionhdrs.net/hdrevening.html
- https://www.openfootage.net/hdri-panorama/
- https://www.zwischendrin.com/en/browse/hdri
Nvidia GauGAN360
Advanced Computer Vision with Python OpenCV and Mediapipe
https://www.freecodecamp.org/news/advanced-computer-vision-with-python/
https://www.freecodecamp.org/news/how-to-use-opencv-and-python-for-computer-vision-and-ai/
Working for a VFX (Visual Effects) studio provides numerous opportunities to leverage the power of Python and OpenCV for various tasks. OpenCV is a versatile computer vision library that can be applied to many aspects of the VFX pipeline. Here’s a detailed list of opportunities to take advantage of Python and OpenCV in a VFX studio:
- Image and Video Processing:
- Preprocessing: Python and OpenCV can be used for tasks like resizing, color correction, noise reduction, and frame interpolation to prepare images and videos for further processing.
- Format Conversion: Convert between different image and video formats using OpenCV’s capabilities.
- Tracking and Matchmoving:
- Feature Detection and Tracking: Utilize OpenCV to detect and track features in image sequences, which is essential for matchmoving tasks to integrate computer-generated elements into live-action footage.
- Rotoscoping and Masking:
- Segmentation and Masking: Use OpenCV for creating and manipulating masks and alpha channels for various VFX tasks, like isolating objects or characters from their backgrounds.
- Camera Calibration:
- Intrinsic and Extrinsic Calibration: Python and OpenCV can help calibrate cameras for accurate 3D scene reconstruction and camera tracking.
- 3D Scene Reconstruction:
- Stereoscopy: Use OpenCV to process stereoscopic image pairs for creating 3D depth maps and generating realistic 3D scenes.
- Structure from Motion (SfM): Implement SfM techniques to create 3D models from 2D image sequences.
- Green Screen and Blue Screen Keying:
- Chroma Keying: Implement advanced keying algorithms using OpenCV to seamlessly integrate actors and objects into virtual environments.
- Particle and Fluid Simulations:
- Particle Tracking: Utilize OpenCV to track and manipulate particles in fluid simulations for more realistic visual effects.
- Motion Analysis:
- Optical Flow: Implement optical flow algorithms to analyze motion patterns in footage, useful for creating dynamic VFX elements that follow the motion of objects.
- Virtual Set Extension:
- Camera Projection: Use camera calibration techniques to project virtual environments onto physical sets, extending the visual scope of a scene.
- Color Grading:
- Color Correction: Implement custom color grading algorithms to match the color tones and moods of different shots.
- Automated QC (Quality Control):
- Artifact Detection: Develop Python scripts to automatically detect and flag visual artifacts like noise, flicker, or compression artifacts in rendered frames.
- Data Analysis and Visualization:
- Performance Metrics: Use Python to analyze rendering times and optimize the rendering process.
- Data Visualization: Generate graphs and charts to visualize render farm usage, project progress, and resource allocation.
- Automating Repetitive Tasks:
- Batch Processing: Automate repetitive tasks like resizing images, applying filters, or converting file formats across multiple shots.
- Machine Learning Integration:
- Object Detection: Integrate machine learning models (using frameworks like TensorFlow or PyTorch) to detect and track specific objects or elements within scenes.
- Pipeline Integration:
- Custom Tools: Develop Python scripts and tools to integrate OpenCV-based processes seamlessly into the studio’s pipeline.
- Real-time Visualization:
- Live Previsualization: Implement real-time OpenCV-based visualizations to aid decision-making during the preproduction stage.
- VR and AR Integration:
- Augmented Reality: Use Python and OpenCV to integrate virtual elements into real-world footage, creating compelling AR experiences.
- Camera Effects:
- Lens Distortion: Correct lens distortions and apply various camera effects using OpenCV, contributing to the desired visual style.
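As one concrete illustration of the keying item in the list above, here is a minimal green-screen matte sketch. It uses plain NumPy rather than any particular OpenCV keyer, and the function names and the `dominance` threshold are assumptions made for this example, not part of any studio tool or library API. It expects float RGB images with values in the 0 to 1 range.

```python
import numpy as np


def green_screen_matte(rgb, dominance=0.15):
    """Return a float matte: 1.0 keeps the pixel, 0.0 keys it out.

    `dominance` (hypothetical parameter) is how far green must exceed the
    stronger of red and blue before a pixel is treated as pure screen.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Amount by which green exceeds the stronger of the other two channels.
    excess = g - np.maximum(r, b)
    # Ramp from 0 to `dominance` gives a soft edge instead of a hard cut.
    return 1.0 - np.clip(excess / dominance, 0.0, 1.0)


def composite(fg, bg, matte):
    """Blend foreground over background using the matte as alpha."""
    fg = np.asarray(fg, dtype=np.float32)
    bg = np.asarray(bg, dtype=np.float32)
    alpha = np.asarray(matte, dtype=np.float32)[..., None]
    return fg * alpha + bg * (1.0 - alpha)
```

A production keyer would also handle spill suppression and edge refinement; this sketch only shows the core idea of deriving an alpha channel from colour difference.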
Interpolating frames from an EXR sequence using OpenCV can be useful when you have only every second frame of a final render and you want to create smoother motion by generating intermediate frames. However, keep in mind that interpolating frames might not always yield perfect results, especially if there are complex changes between frames. Here’s a basic example of how you might use OpenCV to achieve this:
    import os

    # Recent OpenCV builds disable the EXR codec by default; this variable
    # must be set before cv2 is imported.
    os.environ.setdefault("OPENCV_IO_ENABLE_OPENEXR", "1")

    import cv2
    import numpy as np

    # Replace with the path to your EXR frames
    exr_folder = "path_to_exr_frames"
    # Replace with the appropriate frame extension and naming convention
    frame_template = "frame_{:04d}.exr"

    # Define the range of frame numbers you have (every second frame)
    start_frame = 1
    end_frame = 100
    step = 2

    # Define the output folder for interpolated frames
    output_folder = "output_interpolated_frames"
    os.makedirs(output_folder, exist_ok=True)

    # Loop through the frame range and interpolate between each rendered pair
    for frame_num in range(start_frame, end_frame + 1, step):
        frame_path = os.path.join(exr_folder, frame_template.format(frame_num))
        next_frame_path = os.path.join(exr_folder, frame_template.format(frame_num + step))
        if not (os.path.exists(frame_path) and os.path.exists(next_frame_path)):
            continue

        frame = cv2.imread(frame_path, cv2.IMREAD_ANYDEPTH | cv2.IMREAD_COLOR)
        next_frame = cv2.imread(next_frame_path, cv2.IMREAD_ANYDEPTH | cv2.IMREAD_COLOR)
        if frame is None or next_frame is None:
            print(f"Could not read frames {frame_num} / {frame_num + step}, skipping")
            continue

        # Interpolate frames using simple averaging, kept as float32 for EXR output
        interpolated_frame = (frame.astype(np.float32) + next_frame.astype(np.float32)) * 0.5

        # Save under the missing in-between frame number so that the
        # rendered frames are not overwritten
        mid_frame = frame_num + step // 2
        output_path = os.path.join(output_folder, frame_template.format(mid_frame))
        cv2.imwrite(output_path, interpolated_frame)
        print(f"Interpolated frame {mid_frame}")
Please note the following points:
- The above example uses simple averaging to interpolate frames. More advanced interpolation methods might provide better results, such as motion-based algorithms like optical flow-based interpolation.
- EXR files can store high dynamic range (HDR) data, so make sure to use cv2.IMREAD_ANYDEPTH flag when reading these files.
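- OpenCV’s EXR support depends on how it was built, and recent versions keep the EXR codec disabled unless the OPENCV_IO_ENABLE_OPENEXR environment variable is set to 1 before cv2 is imported. If EXR reading is unavailable, read the files with the OpenEXR Python bindings instead and convert the data to NumPy arrays that OpenCV can work with.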
- Consider the characteristics of your specific render when using interpolation. If there are large changes between frames, the interpolation might lead to artifacts.
- Experiment with different interpolation methods and parameters to achieve the desired result.
- For a more advanced and accurate interpolation, you might need to implement or use existing algorithms that take into account motion estimation and compensation.
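The simple averaging described above is a linear blend at the halfway point. A small generalisation, sketched here under the assumption that frames are NumPy arrays of matching shape, produces any number of evenly spaced in-betweens between two rendered frames. The helper names `blend_frames` and `inbetweens` are made up for this example. This is still a cross-dissolve, not motion-compensated interpolation, so moving objects will ghost rather than travel.

```python
import numpy as np


def blend_frames(frame_a, frame_b, t):
    """Linear blend: t=0 returns frame_a, t=1 returns frame_b."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    a = np.asarray(frame_a, dtype=np.float32)
    b = np.asarray(frame_b, dtype=np.float32)
    return (1.0 - t) * a + t * b


def inbetweens(frame_a, frame_b, count):
    """Return `count` evenly spaced blends strictly between the two frames."""
    return [
        blend_frames(frame_a, frame_b, i / (count + 1))
        for i in range(1, count + 1)
    ]
```

With every second frame rendered, `inbetweens(a, b, 1)` reproduces the averaging approach; with every fourth frame rendered, `inbetweens(a, b, 3)` fills the three missing frames.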
Unity 3D resources
http://answers.unity3d.com/questions/12321/how-can-i-start-learning-unity-fast-list-of-tutori.html
If you have no previous experience with Unity, start with these six video tutorials which give a quick overview of the Unity interface and some important features http://unity3d.com/support/documentation/video/
OpenColorIO standard
https://www.provideocoalition.com/color-management-part-11-introducing-opencolorio/
OpenColorIO (OCIO) is a new open source project from Sony Pictures Imageworks.
Based on development started in 2003, OCIO enables color transforms and image display to be handled in a consistent manner across multiple graphics applications. Unlike other color management solutions, OCIO is geared towards motion-picture post production, with an emphasis on visual effects and animation color pipelines.
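To make "color transform" concrete, here is a minimal sketch of one of the most common ones: converting between sRGB-encoded values and linear light using the standard sRGB transfer function. This is plain Python and does not use the OpenColorIO API; the function names are chosen for this example only. In an OCIO pipeline a transform like this would be defined once in a shared config so that every application applies it identically.

```python
def srgb_to_linear(value):
    """Decode one sRGB-encoded channel value (0..1) to linear light."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4


def linear_to_srgb(value):
    """Encode one linear-light channel value (0..1) to sRGB."""
    if value <= 0.0031308:
        return value * 12.92
    return 1.055 * (value ** (1.0 / 2.4)) - 0.055


def transform_pixel(pixel, func):
    """Apply a per-channel transform to an (r, g, b) tuple."""
    return tuple(func(channel) for channel in pixel)
```

For example, `transform_pixel((0.5, 0.5, 0.5), srgb_to_linear)` shows that mid-grey on an sRGB display corresponds to roughly 21% linear light, which is why compositing math done directly on sRGB values gives visibly different results from math done in linear space.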
Generative AI Glossary
https://education.civitai.com/generative-ai-glossary/
Term | Tags | Description |
---|---|---|
.ckpt | Model | “Checkpoint”, a file format created by PyTorch Lightning, a PyTorch research framework. It contains a PyTorch Lightning machine learning model used (by Stable Diffusion) to generate images. |
.pt | Software | A machine learning model file created using PyTorch, containing algorithms used to automatically perform a task. |
.Safetensors | Model | A file format for Checkpoint models, less susceptible to embedded malicious code. See “Pickle” |
ADetailer | Software, Extension | A popular Automatic1111 Extension, mostly used to enhance fine face and eye detail, but can be used to re-draw hands and full characters. |
AGI | Concept | Artificial General Intelligence (AGI), the point at which AI matches or exceeds the intelligence of humans. |
Algorithm | Concept, Software | A series of instructions that allow a computer to learn and analyze data, learning from it, and use that learning to interpret and accomplish future tasks on its own. |
AnimateDiff | Software, Extension | Technique which involves injecting motion into txt2img (or img2img) generations. https://animatediff.github.io/ |
API | Software | Application Programmer Interface – a set of functions and tools which allow interaction with, or between, pieces of software. |
Auto-GPT | Software, LLM | |
Automatic1111 | Developer, SD User Interface | Creator of the popular Automatic1111 WebUI graphical user interface for SD. |
Bard | Software, LLM | Google’s Chatbot, based on their LaMDA model. |
Batch | | A subset of the training data used in one iteration of model training. In inference, a group of images. |
Bias | Concept, LLM | In Large Language Models, errors resulting from training data; stereotypes, attributing certain characteristics to races or groups of people, etc. Bias can cause models to generate offensive and harmful content. |
Bing | Software, LLM | Microsoft’s ChatGPT powered Chatbot. |
CFG | Setting | Classifier Free Guidance, sometimes “Guidance Scale”. Controls how closely the image generation process follows the text prompt. |
Checkpoint | Model | The product of training on millions of captioned images scraped from multiple sources on the Web. This file drives Stable Diffusion’s txt2img, img2img, txt2video |
Civitai (Civitai.com) | Community Resource | Popular hosting site for all types of Generative AI resources. |
Civitai Generator | Software, Tool | Free Stable Diffusion Image Generation Interface, available on Civitai.com. |
Civitai Trainer | Software, Tool | LoRA Training interface, available on Civitai.com, for SDXL and 1.5 based LoRA. |
CLIP | Software | An open source model created by OpenAI. Trained on millions of images and captions, it determines how well a particular caption describes an image. |
Cmdr2 | Developer, SD User Interface | Creator of the popular EasyDiffusion, simple one-click install graphical user interface for SD. |
CodeFormer | Face/Image Restoration, Model | A facial image restoration model, for fixing blurry, grainy, or disfigured faces. |
Colab | Tool | Colaboratory, a product from Google Research, allowing execution of Python code through the browser. Particularly geared towards machine learning applications. https://colab.research.google.com/ |
ComfyUI | SD User Interface, Software | A popular powerful modular UI for Stable Diffusion with a “workflow” type workspace. Somewhat more complex than Auto1111 WebUI https://github.com/comfyanonymous/ComfyUI |
CompVis | Organization | Computer Vision & Learning research group at Ludwig Maximilian University of Munich. They host Stable Diffusion models on Hugging Face. |
Conda | Application, Software | An open source package manager for many programming languages, including Python. |
ControlNet | UI Extension | An Extension to Auto1111 WebUI allowing images to be manipulated in a number of ways. https://github.com/Mikubill/sd-webui-controlnet |
Convergence | Concept | The point in image generation where the image no longer changes as the steps increase. |
CUDA | Hardware, Software | Compute Unified Device Architecture, Nvidia’s parallel processing architecture. |
DALL-E / DALL-E 2 | Organization | Deep learning image models created by OpenAI, available as a commercial image generation service. |
Danbooru | Community Resource | English-based image board website specializing in erotic manga fan art, NSFW. |
Danbooru Tag | Community Resource | System of keywords applied to Danbooru images describing the content within. When using Checkpoint models trained on Danbooru images, keywords can be referenced in Prompts. |
DDIM (Sampler) | Sampler | Denoising Diffusion Implicit Models. See Samplers. |
Deep Learning | Concept | A type of Machine Learning, where neural networks attempt to mimic the behavior of the human brain to perform tasks. |
Deforum | UI Extension, Community Resource | A community of AI image synthesis developers, enthusiasts, and artists, producing Generative AI tools. Most commonly known for a Stable Diffusion WebUI video extension of the same name. |
Denoising/Diffusion | Concept | The process by which random noise (see Seed) is iteratively reduced into the final image. |
depth2img | Concept | Infers the depth of an input image (using an existing model), and then generates new images using both the text and depth information. |
Diffusion Model (DM) | Model | A generative model, used to generate data similar to the data on which they are trained. |
DPM adaptive (Sampler) | Sampler | Diffusion Probabilistic Model (Adaptive). See Samplers. Ignores Step Count. |
DPM Fast (Sampler) | Sampler | Diffusion Probabilistic Model (Fast). See Samplers. |
DPM++ 2M (Sampler) | Sampler | Diffusion Probabilistic Model – Multi-step. Produces good quality results within 15-20 Steps. |
DPM++ 2M Karras (Sampler) | Sampler | Diffusion Probabilistic Model – Multi-step. Produces good quality results within 15-20 Steps. |
DPM++ 2S a Karras (Sampler) | Sampler | Diffusion Probabilistic Model – Single-step. Produces good quality results within 15-20 Steps. |
DPM++ 2Sa (Sampler) | Sampler | Diffusion Probabilistic Model – Single-step. Produces good quality results within 15-20 Steps. |
DPM++ SDE (Sampler) | Sampler | |
DPM++ SDE Karras (Sampler) | Sampler | |
DPM2 (Sampler) | Sampler | |
DPM2 a (Sampler) | Sampler | |
DPM2 a Karras (Sampler) | Sampler | |
DPM2 Karras (Sampler) | Sampler | |
DreamArtist | UI Extension, Software | An extension to WebUI allowing users to create trained embeddings to direct an image towards a particular style, or figure. A PyTorch implementation of the research paper DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Contrastive Prompt-Tuning, Ziyi Dong, Pengxu Wei, Liang Lin. |
DreamBooth | Software, Community Resource | Developed by Google Researchers, DreamBooth is a deep learning image generation model designed to fine-tune existing models (checkpoints). Can be used to create custom models based on a set of images. |
DreamStudio | Organization, SD User Interface | A commercial web-based image generation service created by Stability AI using Stable Diffusion models. |
Dropout (training) | Concept | A technique to prevent overfitting by randomly ignoring some images/tokens, etc. during training. |
DyLoRA C3Lier | ||
DyLoRA LierLa | ||
DyLoRA Lycoris | ||
EMA | Model | Exponential Moving Average. A full EMA Checkpoint model contains extra training data which is not required for inference (generating images). Full EMA models can be used to further train a Checkpoint. |
Emad | Organization, Developer | Emad Mostaque, CEO and co-founder of Stability AI, one of the companies behind Stable Diffusion. |
Embedding | Model, UI Extension | Additional file inputs to help guide the diffusion model to produce images that match the prompt. Can be a graphical style, representation of a person, or object. See Textual Inversion and Aesthetic Gradient. |
Emergent Behavior | Concept, LLM | Unintended abilities exhibited by an AI model. |
Entropy | Concept | A measure of randomness, or disorder. |
Epoch | Concept | The number of times a model training process looked through a full data set of images. E.g. The 5th Epoch of a Checkpoint model looked five times through the same data set of images. |
ESRGAN | Upscaler, Model | Enhanced Super-Resolution Generative Adversarial Networks. A technique to reconstruct a higher-resolution image from a lower-resolution image. E.g. upscaling of a 720p image into 1080p. Implemented as a tool within many Stable Diffusion interfaces. |
Euler (Sampler) | Sampler | Named after Leonhard Euler, a numerical procedure for solving ordinary differential equations, See Samplers. |
Euler a (Sampler) | Sampler | Ancestral version of the Euler sampler. Named after Leonhard Euler, a numerical procedure for solving ordinary differential equations, See Samplers. |
Finetune | Concept | |
float16 | Setting, Model, Concept | Half-Precision floating point number. |
float32 | Setting, Model, Concept | Full-Precision floating point number. |
Generative Adversarial Networks (GANs) | Model | A pair of AI models: one generates new data, and the other evaluates its quality. |
Generative AI | Concept | |
GFPGAN | Face/Image Restoration, Model | Generative Facial Prior, a facial restoration model for fixing blurry, grainy, or disfigured faces. |
Git (GitHub) | Application, Software | Hosting service for software development, version control, bug tracking, documentation. |
GPT-3 | Model, LLM | Generative Pre-trained Transformer 3, a language model, using machine learning to produce human-like text, based on an initial prompt. |
GPT-4 | Model, LLM | Generative Pre-trained Transformer 4, a language model, using machine learning to produce human-like text, based on an initial prompt. A huge leap in performance and reasoning capability over GPT 3/3.5. |
GPU | Hardware | A Graphics Processing Unit, a type of processor designed to perform quick mathematical calculations, allowing it to render images and video for display. |
Gradio | Software | A web-browser based interface framework, specifically for Machine Learning applications. Auto1111 WebUI runs in a Gradio interface. |
Hallucinations (LLM) | LLM, Concept | Sometimes LLM models like ChatGPT produce information that sounds plausible but is nonsensical or entirely false. This is called a Hallucination. |
Hash (Checkpoint model) | Model, Concept | An algorithm for verifying the integrity of a file, by generating an alphanumeric string unique to the file in question. Checkpoint models are hashed, and the resulting string can be used to identify that model. |
Heun (Sampler) | Sampler | Named after Karl Heun, a numerical procedure for solving ordinary differential equations. See Samplers. |
Hugging Face | Organization | A community/data science platform providing tools to build, train, and deploy machine learning models. |
Hypernetwork (Hypernet) | Model | A method to guide a Checkpoint model towards a specific theme, object, or character based on its own content (no external data required). |
img2img | Concept | Process to generate new images based on an input image, and txt2img prompt. |
Inpainting | Concept | The practice of removing or replacing objects in an image based on a painted mask. |
Kohya | Software | Can refer to Kohya-ss scripts for LoRA/finetuning (https://github.com/kohya-ss/sd-scripts) or the Windows GUI implementation of those scripts (https://github.com/bmaltais/kohya_ss) |
LAION | Organization | A non-profit organization, providing data sets, tools, and models, for machine learning research. |
LAION-5B | Model | A large-scale dataset for research purposes consisting of 5.85 billion CLIP-filtered image-text pairs. |
Lanczos | Upscaler, Model | An interpolation method used to compute new values for sampled data. In this case, used to upscale images. Named after creator, Cornelius Lanczos. |
Large Language Model (LLM) | LLM, Model | A type of Neural Network that learns to write and converse with users. Trained on billions of pieces of text, LLMs excel at producing coherent sentences and replying to prompts in the correct context. They can perform tasks such as re-writing and summarizing text, chatting about various topics, and performing research. |
Latent Diffusion | Model | A type of diffusion model that contains compressed image representations instead of the actual images. This type of model allows the storage of a large amount of data that can be used by encoders to reconstruct images from textual or image inputs. |
Latent Mirroring | Concept, UI Extension | Applies mirroring to the latent images mid-generation to produce anything from subtly balanced compositions to perfect reflections. |
Latent Space | Concept | The information-dense space where the diffusion model’s image representation, attention, and transformation are merged and form the initial noise for the diffusion process. |
LDSR | Upscaler | Latent Diffusion Super Resolution upscaling. A method to increase the dimensions/quality of images. |
Lexica | Community Resource | Lexica.art, a search engine for stable diffusion art and prompts. |
LlamaIndex (GPT Index) | Software, LLM | https://github.com/jerryjliu/llama_index – Allows the connection of text data to an LLM via a generated “index”. |
LLM | LLM, Model | A type of Neural Network that learns to write and converse with users. Trained on billions of pieces of text, LLMs excel at producing coherent sentences and replying to prompts in the correct context. They can perform tasks such as re-writing and summarizing text, chatting about various topics, and performing research. |
LMS (Sampler) | Sampler | |
LMS Karras (Sampler) | Sampler | |
LoCON | ||
LoHa | ||
LoKR | ||
LoRA | Model, Concept | Low-Rank Adaptation, a method of training for SD, much like Textual Inversion. Can capture styles and subjects, producing better results in a shorter time, with smaller output files, than traditional finetuning. |
LoRA C3Lier | ||
LoRA LierLa | ||
Loss (function) | Concept | A measure of how well an AI model’s outputs match the desired outputs. |
Merge (Checkpoint) | Model | A process by which Checkpoint models are combined (merged) to form new models. Depending on the merge method (see Weighted Sum, Sigmoid) and multiplier, the merged model will retain varying characteristics of its constituent models. |
Metadata | Concept, Software | Metadata is data that describes data. In the context of Stable Diffusion, metadata is often used to describe the Prompt, Sampler settings, CFG, steps, etc. which are used to define an image, and stored in a .png header. |
MidJourney | Organization, SD User Interface | A commercial web-based image generation service, similar to DALL-E, or the free, open source, Stable Diffusion. |
Model | Model | Alternative term for Checkpoint |
Motion Module | Software | Used by AnimateDiff to inject motion into txt2img (or img2img) generations. |
Multimodal AI | Concept | AI that can process multiple types of inputs, including text, images, video or speech. |
Negative Prompt | Setting, Concept | Keywords which tell a Stable Diffusion prompt what we don’t want to see, in the generated image. |
Neural Network | Concept, Software | Mathematical systems that act like a human brain, with layers of artificial “neurons” helping find connections between data. |
Notebook | Community Resource, Software | See Colab. A Jupyter notebook service providing access, free of charge, to computing resources including GPUs. |
NovelAI (NAI) | Organization | A paid, subscription based AI-assisted story (text) writing service. Also has a txt2img model, which was leaked and is now incorporated into many Stable Diffusion models. |
Olivio (Sarikas) | Community Resource | Olivio produces wonderful SD content on YouTube (https://www.youtube.com/@OlivioSarikas) – one of the best SD news YouTubers out there! |
OpenAI | Organization | AI research laboratory consisting of the for-profit corporation OpenAI LP and the non-profit OpenAI Inc. |
OpenPose | Model, Software | A method for extracting a “skeleton” from an image of a person, allowing poses to be transferred from one image to another. Used by ControlNet. |
Outpainting | Concept | The practice of extending the outer border of an image, into blank canvas space, while maintaining the style and content of the image. |
Overfitting | Concept | When an AI model learns the training data too well and performs poorly on unseen data. |
Parameters (LLMs) | Concept, Software, LLM | The learned numerical values (weights) of a Large Language Model, set during training. The parameter count is a rough indicator of how proficient the model is at its tasks. E.g. a 6B (Billion) Parameter model will likely perform less well than a 13B Parameter model. |
Pickle | Concept, Software | Community slang term for potentially malicious code hidden within models and embeddings. To be “pickled” is to have unwanted code execute on your machine (be hacked). |
PLMS (Sampler) | Sampler | Pseudo Linear Multi-Step method. See Samplers. |
Prompt | Concept | Text input to Stable Diffusion describing the particulars of the image you would like output. |
Pruned/Pruning | Model | A method of optimizing a Checkpoint model to increase the speed of inference (prompt generation) and reduce file size and VRAM cost. |
Python | Application, Software | A popular, high-level, general purpose coding language. |
PyTorch | Application, Software | An open source machine learning library, created by Meta. |
Real-ESRGAN | Upscaler | An image restoration method. |
Refiner | Model | Part of SDXL’s two-stage pipeline – the Refiner further enhances detail from the base model. |
SadTalker | UI Extension | https://github.com/OpenTalker/SadTalker A framework for facial animation/lip synching based upon an audio input. |
Samplers | Sampler | Mathematical functions providing different ways of solving differential equations. Each will produce a slightly (or significantly) different image result from the random latent noise generation. |
Sampling Steps | Sampler, Concept | The number of steps to spend generating (diffusing) your image. |
SD 1.4 | Model | A latent txt2img model, the default model for SD at release. Fine-tuned on 225k steps at resolution 512×512 on laion-aesthetics v2 data set. |
SD 1.5 | Model | A latent txt2img model, updated version of 1.4, fine-tuned on 595k steps at resolution 512×512 on laion-aesthetics v2 data set. |
SD UI | Application, Software | Colloquial term for Cmdr2’s popular graphical interface for Stable Diffusion prompting. |
SD.Next | Software | See Vlad, Vladmandic Fork of Auto1111 WebUI. |
SDXL 0.9 | Model | Stability AI’s latest (March 2023) Stable Diffusion Model. Will become SDXL 1.0 and be released ~July 2023. |
Seed | Concept | A pseudo-random number used to initialize the generation of random noise, from which the final image is built. Seeds can be saved and used along with other settings to recreate a particular image. |
Shoggoth Tongue | Concept, LLM | A humorous allusion to the language of the fictional monsters in the Cthulhu Mythos, “Shoggoth Tongue” is the name given to advanced ChatGPT commands which are particularly arcane and difficult to understand, but allow ChatGPT to perform advanced actions outside of the intended operation of the system. |
Sigmoid (Interpolation Method) | Model, Concept | A method for merging Checkpoint Models based on a Sigmoid function – a mathematical function producing an “S” shaped curve. |
Stability AI | Organization | AI technology company co-founded by Emad Mostaque. One of the companies behind Stable Diffusion. |
Stable Diffusion (SD) | Application, Software | A deep learning, text-to-image model released in 2022. It is primarily used to generate detailed images based on provided text descriptions. |
SwinIR | Face/Image Restoration, Model | An image restoration transform, aiming to restore high quality images from low quality images. |
Tensor | Software | A container, in which multi-dimensional data can be stored. |
Tensor Core | Hardware | Processing unit technology developed by Nvidia, designed to carry out matrix multiplication, an arithmetic operation. |
Textual Inversion | Model, Concept, UI Extension | A technique for capturing concepts from a small number of sample images in a way that can influence txt2img results towards a particular face, or object. |
Token | Concept | A token is roughly a word, a punctuation mark, or a Unicode character in a prompt. |
Tokenizer | Concept, Model | The process/model through which text prompts are turned into tokens, for processing. |
Torch 2.0 | Software | The latest (March 2023) PyTorch release. |
Training | Concept | The process of teaching an AI model by feeding it data and adjusting its parameters. |
Training Data | Model | A set of many images used to “train” a Stable Diffusion model, or embedding. |
Training Data | Concept, LLM, Model | The data sets used to help AI models learn; can be text, images, code, or other data, depending on the type of model to be trained. |
Turing Test | Concept | Named after mathematician Alan Turing, a test of a machine’s ability to behave like a human. The machine passes if a human can’t distinguish the machine’s response from another human. |
txt2img | Concept, Model | Model/method of image generation via entry of text input. |
txt2video | Concept, Model | Model/method of video generation via entry of text input. |
Underfitting | Concept | When an AI model cannot capture the underlying pattern of the data due to incomplete training. |
UniPC (Sampler) | Sampler | A recently released (3/2023) sampler based upon https://huggingface.co/docs/diffusers/api/schedulers/unipc |
Upscale | Upscaler, Concept | The process of converting low resolution media (images or video) into higher resolution media. |
VAE | Model | Variational Autoencoder. A .vae.pt file which accompanies a Checkpoint model and provides additional detail improvements. Not all Checkpoints have an associated vae file, and some vae files are generic and can be used to improve any Checkpoint model. |
Vector (Prompt Word) | Concept | An attempt to mathematically represent the meaning of a word, for processing in Stable Diffusion. |
Venv | Software | A Python “Virtual Environment” which allows multiple instances of Python packages to run, independently, on the same PC. |
Vicuna | LLM, Software, Model | https://vicuna.lmsys.org/ An open-source chatbot model developed by students and faculty from UC Berkeley in collaboration with UCSD and CMU. |
Vladmandic | Software, SD User Interface | A popular “Fork” of Auto1111 WebUI, with its own feature-set. https://github.com/vladmandic/automatic |
VRAM | Hardware | Video random access memory. Dedicated Graphics Card (GPU) memory used to store pixels, and other graphical processing data, for display. |
Waifu Diffusion | Model | A popular text-to-image model, trained on high quality anime images, which produces anime style image outputs. Originally produced for SD 1.4, now has an SDXL version. |
WebUI | Application, Software, SD User Interface | Colloquial term for Automatic1111’s WebUI – a popular graphical interface for Stable Diffusion prompting. |
Weighted Sum (Interpolation Method) | Concept | A method of Checkpoint merging using the formula Result = ( A * (1 – M) ) + ( B * M ) . |
Weights | Model | Alternative term for Checkpoint. |
Wildcards | Concept | Text files containing terms (clothing types, cities, weather conditions, etc.) which can be automatically input into image prompts, for a huge variety of dynamic images. |
xformers | UI Extension, Software | Optional library to speed up image generation. Superseded somewhat by new options implemented by Torch 2.0. |
yaml | Software, UI Extension, Model | A human-readable data-serialization language commonly used for configuration files. Yaml files accompany Checkpoint models, and provide Stable Diffusion with additional information about the Checkpoint. |
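Two of the glossary entries above, Seed and Weighted Sum, are easy to try out in a few lines of NumPy. The following is only a rough sketch: the function names, the tiny arrays, and the dictionary layout standing in for checkpoints are all made up for illustration, and real Stable Diffusion tools use their own random generators and file formats.

```python
import numpy as np


def initial_noise(seed, shape=(4, 4)):
    """Return random noise that depends only on the seed (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)


def weighted_sum(a, b, m):
    """Blend two weight arrays using Result = (A * (1 - M)) + (B * M)."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must be between 0 and 1")
    return a * (1.0 - m) + b * m


# The same seed always gives the same starting noise; a different seed does not.
same = np.array_equal(initial_noise(42), initial_noise(42))
different = not np.array_equal(initial_noise(42), initial_noise(7))
print(same, different)  # True True

# Toy stand-ins for two checkpoints: layer name -> weights (assumed layout).
checkpoint_a = {"layer1": np.array([1.0, 2.0, 3.0])}
checkpoint_b = {"layer1": np.array([3.0, 4.0, 5.0])}

# M = 0.25 keeps 75% of A and takes 25% of B.
merged = {
    name: weighted_sum(checkpoint_a[name], checkpoint_b[name], 0.25)
    for name in checkpoint_a
}
print(merged["layer1"])  # [1.5 2.5 3.5]
```

With M = 0 the result is checkpoint A unchanged, and with M = 1 it is checkpoint B.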
Python NumPy: the absolute basics for beginners
https://numpy.org/doc/stable/user/absolute_beginners.html
NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.
The NumPy library contains multidimensional array and matrix data structures (you’ll find more information about this in later sections). It provides ndarray, a homogeneous n-dimensional array object, with methods to efficiently operate on it. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.
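As a quick illustration of the ndarray described above, here is a minimal sketch using a small made-up 2×3 array. The variable names are arbitrary; the calls themselves (`np.array`, `sum`, the `@` operator) are standard NumPy.

```python
import numpy as np

# A 2-dimensional ndarray: every element has the same type (homogeneous).
a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(a.shape)  # (2, 3)
print(a.ndim)   # 2

# Arithmetic applies element by element, with no explicit loops.
doubled = a * 2
print(doubled.tolist())  # [[2, 4, 6], [8, 10, 12]]

# Reductions can run along an axis; axis=0 sums down each column.
col_sums = a.sum(axis=0)
print(col_sums.tolist())  # [5, 7, 9]

# Matrix multiplication uses the @ operator.
product = a @ a.T
print(product.tolist())  # [[14, 32], [32, 77]]
```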
Netflix removes movie noise, saves 30% bandwidth and adds it back again
https://www.slashcam.com/news/single/Netflix-removes-movie-noise--saves-30--bandwidth-a-17337.html
“Filmmaker Parker Gibbons has drawn attention to a very interesting fact: Netflix removes film noise before streaming its movies and artificially adds it back when decoding. This is because digitally shot films are actually free of any film grain, the very specific (not to be confused with noise caused by too little light) noise that occurs in analog filming. But this type of noise has become so associated with ‘real’ motion pictures through the long history of film (as a component of the film look) that it is unconsciously perceived by many viewers as an important feature of a motion picture.
…
This leads to a difficult-to-resolve contradiction between, on the one hand, film material that is as compressible and noise-free as possible, and, on the other hand, the noise caused by film grain that is desirable for the film look. Netflix has found a very special solution to resolve this contradiction. It uses a very special function of the open source AV1 video codec, which Netflix has been using for a long time, namely the artificial synthesis of film grain. Thus, film noise is first analyzed using statistical methods before compression and then removed for efficient compression. According to Netflix, this saves around 30% of the data during transmission.”
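The analyze, remove, and re-synthesize idea in the quote can be sketched as a toy NumPy example. This is not how AV1 or Netflix implement it: the box blur, the single standard-deviation "grain parameter", and all names below are simplifying assumptions, and the real codec tool is considerably more elaborate.

```python
import numpy as np


def box_blur(img):
    """Very simple 3x3 box blur standing in for a real denoiser (assumption)."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    height, width = img.shape
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + height, dx:dx + width]
    return out / 9.0


# A made-up 64x64 "frame": a smooth gradient plus synthetic grain.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.2, 0.8, 64), (64, 1))
grainy = clean + rng.normal(0.0, 0.05, clean.shape)

# Encoder side: remove the grain and keep only a statistic describing it.
denoised = box_blur(grainy)
estimated_sigma = float(np.std(grainy - denoised))

# Decoder side: synthesize fresh grain with the same strength and add it back.
decoder_rng = np.random.default_rng(1)
restored = denoised + decoder_rng.normal(0.0, estimated_sigma, denoised.shape)

print(f"estimated grain strength: {estimated_sigma:.3f}")
print(restored.shape)  # (64, 64)
```

The point of the sketch is that only the smooth `denoised` frame and one small number would need to be transmitted; the restored grain is statistically similar to the original but not pixel-identical.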
online real time collaborative text editor
a highly customizable Open Source online editor providing collaborative editing in really real-time
Processing – a flexible software sketchbook
Processing is a flexible software sketchbook and a language for learning how to code within the context of the visual arts. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. There are tens of thousands of students, artists, designers, researchers, and hobbyists who use Processing for learning and prototyping.
» Free to download and open source
» Interactive programs with 2D, 3D or PDF output
» OpenGL integration for accelerated 2D and 3D
» For GNU/Linux, Mac OS X, Windows, Android, and ARM
» Over 100 libraries extend the core software