Copyright traps (see Meeus et al. (ICML 2024)) are unique, synthetically generated sequences who have been included into the training dataset of CroissantLLM. This dataset allows for the evaluation of Membership Inference Attacks (MIAs) using CroissantLLM as target model, where the goal is to infer whether a certain trap sequence was either included in or excluded from the training data.
This dataset contains non-member (label=0) and member (label=1) trap sequences, which have been generated using this code and by sampling text from LLaMA-2 7B while controlling for sequence length and perplexity. The dataset contains splits according to seq_len_{XX}_n_rep_{YY} where sequences of XX={25,50,100} tokens are considered and YY={10, 100, 1000} number of repetitions for member sequences. Each dataset also contains the ‘perplexity bucket’ for each trap sequence, where the original paper showed that higher perplexity sequences tend to be more vulnerable.
Note that for a fixed sequence length, and across various number of repetitions, each split contains the same set of non-member sequences (n_rep=0). Also additional non-members generated in exactly the same way are provided here, which might be required for some MIA methodologies making additional assumptions for the attacker.
Synchron is building a brain-computer interface, or a BCI, designed to help patients with paralysis operate technology like smartphones and computers with their minds.
The financial terms of the deal weren’t disclosed, but Canva co-founder and chief product officer Cameron Adams said it’s a mix of cash and stock. All of Leonardo.ai’s 120 employees will be joining Canva, including the executive team.
“Leonardo will continue to run independently of Canva with a focus on rapid innovation, research and development, now backed by Canva’s resources,” Adams told TechCrunch. “We’ll keep offering all of Leonardo’s existing tools and solutions. This acquisition aims to help Leonardo develop its platform and deepen their user growth with our investment, including by expanding their API business and investing in foundational model R&D.”
This strategic move supports Autodesk’s goal to democratize creative tools and foster innovation in the media and entertainment industry. Terms of the deal were not disclosed.
Open-source fonts packaged into individual NPM packages for self-hosting in web applications. Self-hosting fonts can significantly improve website performance, remain version-locked, work offline, and offer more privacy.
🔹 1943: 𝗠𝗰𝗖𝘂𝗹𝗹𝗼𝗰𝗵 & 𝗣𝗶𝘁𝘁𝘀 create the first artificial neuron. 🔹 1950: 𝗔𝗹𝗮𝗻 𝗧𝘂𝗿𝗶𝗻𝗴 introduces the Turing Test, forever changing the way we view intelligence. 🔹 1956: 𝗝𝗼𝗵𝗻 𝗠𝗰𝗖𝗮𝗿𝘁𝗵𝘆 coins the term “Artificial Intelligence,” marking the official birth of the field. 🔹 1957: 𝗙𝗿𝗮𝗻𝗸 𝗥𝗼𝘀𝗲𝗻𝗯𝗹𝗮𝘁𝘁 invents the Perceptron, one of the first neural networks. 🔹 1959: 𝗕𝗲𝗿𝗻𝗮𝗿𝗱 𝗪𝗶𝗱𝗿𝗼𝘄 and 𝗧𝗲𝗱 𝗛𝗼𝗳𝗳 create ADALINE, a model that would shape neural networks. 🔹 1969: 𝗠𝗶𝗻𝘀𝗸𝘆 & 𝗣𝗮𝗽𝗲𝗿𝘁 solve the XOR problem, but also mark the beginning of the “first AI winter.” 🔹 1980: 𝗞𝘂𝗻𝗶𝗵𝗶𝗸𝗼 𝗙𝘂𝗸𝘂𝘀𝗵𝗶𝗺𝗮 introduces Neocognitron, laying the groundwork for deep learning. 🔹 1986: 𝗚𝗲𝗼𝗳𝗳𝗿𝗲𝘆 𝗛𝗶𝗻𝘁𝗼𝗻 and 𝗗𝗮𝘃𝗶𝗱 𝗥𝘂𝗺𝗲𝗹𝗵𝗮𝗿𝘁 introduce backpropagation, making neural networks viable again. 🔹 1989: 𝗝𝘂𝗱𝗲𝗮 𝗣𝗲𝗮𝗿𝗹 advances UAT (Understanding and Reasoning), building a foundation for AI’s logical abilities. 🔹 1995: 𝗩𝗹𝗮𝗱𝗶𝗺𝗶𝗿 𝗩𝗮𝗽𝗻𝗶𝗸 and 𝗖𝗼𝗿𝗶𝗻𝗻𝗮 𝗖𝗼𝗿𝘁𝗲𝘀 develop Support Vector Machines (SVMs), a breakthrough in machine learning. 🔹 1998: 𝗬𝗮𝗻𝗻 𝗟𝗲𝗖𝘂𝗻 popularizes Convolutional Neural Networks (CNNs), revolutionizing image recognition. 🔹 2006: 𝗚𝗲𝗼𝗳𝗳𝗿𝗲𝘆 𝗛𝗶𝗻𝘁𝗼𝗻 and 𝗥𝘂𝘀𝗹𝗮𝗻 𝗦𝗮𝗹𝗮𝗸𝗵𝘂𝘁𝗱𝗶𝗻𝗼𝘃 introduce deep belief networks, reigniting interest in deep learning. 🔹 2012: 𝗔𝗹𝗲𝘅 𝗞𝗿𝗶𝘇𝗵𝗲𝘃𝘀𝗸𝘆 and 𝗚𝗲𝗼𝗳𝗳𝗿𝗲𝘆 𝗛𝗶𝗻𝘁𝗼𝗻 launch AlexNet, sparking the modern AI revolution in deep learning. 🔹 2014: 𝗜𝗮𝗻 𝗚𝗼𝗼𝗱𝗳𝗲𝗹𝗹𝗼𝘄 introduces Generative Adversarial Networks (GANs), opening new doors for AI creativity. 🔹 2017: 𝗔𝘀𝗵𝗶𝘀𝗵 𝗩𝗮𝘀𝘄𝗮𝗻𝗶 and team introduce Transformers, redefining natural language processing (NLP). 🔹 2020: OpenAI unveils GPT-3, setting a new standard for language models and AI’s capabilities. 🔹 2022: OpenAI releases ChatGPT, democratizing conversational AI and bringing it to the masses.
The focal length of an optical system is a measure of how strongly the system converges or diverges light.
Without getting into an in-depth physics discussion, the focal length of a lens is an optical property of the lens.
The exact definition is:Focal length measures the distance, in millimeters, between the “nodal point” of the lens and the camera’s sensor.
About 576 megapixels for the entire field of view.
Consider a view in front of you that is 90 degrees by 90 degrees, like looking through an open window at a scene. The number of pixels would be:
90 degrees * 60 arc-minutes/degree * 1/0.3 * 90 * 60 * 1/0.3 = 324,000,000 pixels (324 megapixels).
At any one moment, you actually do not perceive that many pixels, but your eye moves around the scene to see all the detail you want. But the human eye really sees a larger field of view, close to 180 degrees. Let’s be conservative and use 120 degrees for the field of view. Then we would see: