Required Qualifications
- 3+ years of experience in applied data science or machine learning roles, including hands-on work with Python, NLP, and LLM implementation.
- 3+ years of experience in data exploration, cleaning, analysis, visualization, or data mining.
- Experience working with production-grade systems, data lake environments, and streaming data technologies such as Kafka.
- Proven ability to build and deploy end-to-end ML workflows, from data preparation through model deployment and evaluation.
- Strong aptitude for learning new infrastructure or systems concepts, including pipeline integrations with data lake architectures.
- Demonstrated capability in designing, implementing, and iterating on ML models for document classification, information extraction, summarization, and search.
- Ability to own and maintain data science workflows that support a production system processing millions of documents weekly.
- Active TS/SCI clearance.
- Bachelor’s degree.
Additional Qualifications
- Experience collaborating closely with MLOps and infrastructure teams to deliver reliable model deployment, monitoring, and retraining pipelines.
- Experience supporting platform components such as document indexing, search systems, GPU workloads, or distributed storage environments (e.g., Cloudera).
- Background in developing algorithms using R, Python, SQL, or NoSQL technologies.
- Familiarity with distributed data and computing tools, such as MapReduce, Hadoop, Hive, EMR, Spark, Gurobi, or MySQL.
- Experience with data visualization libraries such as Plotly, Seaborn, or ggplot2.
- Security+ certification.
