Auditing datasets is essential: spend a substantial amount of time inspecting your dataset at multiple stages of the dataset design process. Many datasets have problems specifically because their authors did not audit them sufficiently before release. Use systematic studies of the process, in addition to data search, analysis, and exploration tools, to track the dataset's evolution.
9 Data Auditing Resources for Foundation Models
Multimodal datasets: misogyny, pornography, and malignant stereotypes
Auditing vision datasets for sensitive content.
Text Vision
On Hate Scaling Laws For Data-Swamps
Auditing text and vision datasets for systemic biases and hate.
Text Vision