Daniel Otero

AI Engineer · Data Scientist · Applied Researcher

I build production systems across three fronts — agentic AI (LangGraph, RAG, multi-agent), NLP and text analytics (semantic search, embeddings, bibliometric networks), and applied data science (clustering, dashboards, pipelines) — for research and product teams across Latin America.

About

Bridging social science and AI

I'm an economist and computer-science engineer (M.Sc.) working across three fronts: agentic AI (LLM orchestration, RAG, multi-agent systems), NLP and text analytics (semantic search, embeddings, bibliometric networks), and applied data science (clustering, statistical modeling, dashboards). My path moves between them — sometimes within a single project.

That breadth means I do the technical work and understand the social, organizational, and research context behind it. I've shipped conversational agents serving hundreds of users monthly across Latin America, ML clustering pipelines for survey research, RAG systems with vector search, and 6 monitoring dashboards across 4 countries for data-capture and impact-evaluation processes.

I currently lead data science and AI at Estudio Plural, designing LLM-based tools for behavioral research, knowledge retrieval, and organizational intelligence. I also publish peer-reviewed work on bibliometric NLP, teach, and consult on applied research projects when there's a good fit.

700+

Active WhatsApp bot users / month

Countries reached with data systems

104K

Nodes in bibliometric citation network

4

Peer-reviewed publications

What I work with

Agentic AI & LLMs

LangChain · LangGraph · RAG · Multi-agent · Prompt Engineering · Fine-tuning · Hugging Face · OpenRouter

NLP & Text Analytics

Embeddings · Semantic Search · Text Classification · Sentiment Analysis · Network Analysis · Bibliometrics

Data Science & Stats

Python · R · Pandas · scikit-learn · Plotly · Clustering · PCA · Statistical Modeling

Agentic Coding Systems

Claude Code · Codex · OpenCode

Infrastructure & Storage

FastAPI · Streamlit · Next.js · Docker · GitHub Actions · n8n · Twilio · PostgreSQL · MongoDB · Qdrant · Neo4j

Experience

Where I've worked

Estudio Plural

Data Science & AI Specialist Jan 2025 – Present
Behavioral Research & Analytics Lead Jun 2024 – Dec 2024
Data Analytics Consultant Dec 2023 – May 2024
Data Analytics Consultant Jul 2023 – Aug 2023

Built 2 conversational agents with LangGraph: the first is deployed across 4 countries (CO/EC/PE/BO) with ~400 users/month; the second is active in CO & MX with ~300 users/month.
Delivered real-time monitoring dashboards with leaderboards and automatic report generation, giving researchers immediate access to bot usage metrics without manual extraction.
Set up automation flows in n8n and Zapier for administrative and accounting work → 80% time saved on repetitive tasks.
Developed an AI system that automatically detects funding opportunities → 10 hours/week saved for the project formulation team.
Wrote Python data pipelines connecting KoboToolbox & Typeform to Supabase dashboards → ~90% reduction in field-data monitoring time.
Built a multi-agent processing system for clustering and behavioral narrative generation over survey data → 50% reduction in analysis time.

Glasswing International

AI & Automation Consultant May 2025 – Jul 2025

Designed and shipped an end-to-end accounting-automation system that replaces the accountant's manual forwarding and review of supplier invoices → ~12 hours/week of manual work freed. The routing worker (Python, Gmail API + SQLite) classifies each incoming invoice and routes it to the right coordinators from a matrix of 323 suppliers and ~1,900 assignments, running in production on a VPS (Docker Compose behind Tailscale) with Telegram alerts.
Built the automatic payment-package verification module: it extracts and cross-checks multiple documents (invoice, RUT, chamber-of-commerce, bank certification, transfer/reimbursement request) with Claude via OpenRouter (text + vision for scans), robust to how attachments arrive (merged or separate). It flags amount, NIT and beneficiary mismatches and replies approving or returning with corrections → each review cut from ~15 min to under 1 min.

Octopus Force

Project Analyst Jan 2025 – Mar 2026
Research Leader Jul 2023 – Dec 2024

Built a monitoring dashboard for the SGR (General Royalties System), integrated with the datos.gov.co Open Data API → 8 hours/week saved for the project formulation team.
Developed a prompt library for technology surveillance → research time per report cut from 8 to 3 days (-63%), applied across ~20 reports for companies in Valle del Cauca.
Deployed intelligent agents for information synthesis and organization across research, search, and project formulation in public and corporate contexts.
Built an MVP of a multi-agent assistant for document management, focused on classifying technical and administrative documents and making them easy to retrieve.

Universidad del Valle, CIDSE

Data Analysis Consultant Nov 2024 – Dec 2024
Data Analytics & Experimental Design Consultant Oct 2023 – Dec 2023
Shiny Developer Oct 2020 – Dec 2020

Designed samples and built and deployed experimental surveys in oTree, processing the results with clustering algorithms for computational social science projects.
Narrative and social-network analysis using NLP and text mining in R.
Interactive Shiny dashboards for non-technical research teams.

Tell Business Storytelling

Data Analytics Consultant Mar 2024 – Jun 2024
Data Scientist Mar 2020 – Dec 2021

Designed and analyzed a Typeform survey end to end, including the final report → 50% time reduction vs. the previous process.
Automated survey processing and report generation across 6 countries (CL, SV, CO, MX, UY, PE) → 90% time saved for the research team.
Built a data capture system using the Twitter API and Google Trends/News → weekly collection cut from 2 days to 20 minutes.
NLP and text mining pipelines for sentiment analysis, clustering, and user-persona construction over the captured data.

Universidad del Valle, CINARA Institute

Quantitative Analytics Lead · PUDA2022 Project Sep 2021 – Jun 2023

Designed samples and built socio-environmental surveys; processed the results with PCA and clustering to characterize data in water and sanitation contexts.
Applied network models and fuzzy logic systems to complex socio-environmental systems.

Fundación Univalle

Advisor Sep 2020 – Nov 2020

Applied clustering and PCA to SISBEN IV data to categorize and georeference vulnerable populations, with technical reporting.

Directrix Analytics

Data Scientist · HORIZONT Project Nov 2018 – Apr 2019

Automated industrial sensor data capture from a kiln at the ARGOS Yumbo (Valle del Cauca) plant, enabling continuous monitoring of process variables.
PCA and predictive models on the captured data; visualization dashboards for plant teams.

Universidad del Valle, CIDSE

Research Assistant Jan 2017 – Jul 2020

Automated download of scientific citation data from the RePEc API, building a network of 104,589 nodes → 6 months of manual work saved.
Built semantic models and citation networks in R and Python for bibliometric and influence analysis in economics.
Co-author of 4 peer-reviewed publications in international journals (see Publications).

Publications

Bibliometric NLP and citation-network analysis applied to economic discourse — 104K+ nodes across four peer-reviewed studies.

The Drifting Influence of Hall's Random Walk Hypothesis on Consumption Modeling

García, C., Otero, D. & Salazar, B. · History of Political Economy, 55(1), 103–143 · 2023

doi.org/10.1215/00182702-10213653 ↗

A Tale of a Tool: The Impact of Sims's Vector Autoregressions on Macroeconometrics

Salazar, B. & Otero, D. · History of Political Economy, 51(3), 557–578 · 2019

doi.org/10.1215/00182702-7551924 ↗

La revolución empírica en economía

Salazar, B. & Otero, D. · Apuntes del CENES, 38(68) · 2019

doi.org/10.19053/01203053.v38.n68.2019.8792 ↗

La revolución de los nuevos clásicos: redes, influencia y metodología

Salazar, B. & Otero, D. · Revista de Economía Institucional, 17(32), 39–69 · 2015

doi.org/10.18601/01245996.v17n32.02 ↗

How I can help

Hire me to take an idea from prototype to production — across conversational AI, data systems, and applied research.

Conversational AI & Agents

Production chatbots and multi-agent assistants — multilingual, with RAG over your documents and conversation memory. Deployed where your users already are.

WhatsApp · Web · RAG · LangGraph · Multi-agent

Data Pipelines & Dashboards

From messy field data to decisions your team can act on: ingestion, automated validation and QC, and live dashboards they'll actually use.

KoboToolbox · Streamlit · PostgreSQL · Automation

NLP & Applied Research

Research-grade text and data analysis — semantic search, classification, sentiment, clustering and network analysis over surveys, documents and organizational data.

Embeddings · Semantic Search · Clustering · Networks