Required Qualifications
- 3+ years of experience in applied data science or machine learning roles, including hands-on work with Python, NLP, and LLM implementation.
- 3+ years of experience in data exploration, cleaning, analysis, visualization, or data mining.
- Experience working with production-grade systems, data lake environments, and streaming data technologies such as Kafka.
- Proven ability to build and deploy end-to-end ML workflows, from data preparation through model deployment and evaluation.
- Strong aptitude for learning new infrastructure or systems concepts, including pipeline integrations with data lake architectures.
- Demonstrated capability in designing, implementing, and iterating on ML models for document classification, information extraction, summarization, and search.
- Ability to own and maintain data science workflows that support a production system processing millions of documents weekly.
- Active TS/SCI clearance.
- Bachelor’s degree.
Additional Qualifications
- Experience collaborating closely with MLOps and infrastructure teams to deliver reliable model deployment, monitoring, and retraining pipelines.
- Experience supporting platform components such as document indexing, search systems, GPU workloads, or distributed storage environments (e.g., Cloudera).
- Background in developing algorithms using R, Python, SQL, or NoSQL technologies.
- Familiarity with distributed data and computing tools, such as MapReduce, Hadoop, Hive, EMR, Spark, Gurobi, or MySQL.
- Experience with data visualization libraries such as Plotly, Seaborn, or ggplot2.
- Security+ certification.
