Job Description
Data Scientist - Agentic AI
This role has been designed as "Onsite" with an expectation that you will primarily work from an HPE office.
Who We Are:
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
Job Description:
Data Scientist – Agentic AI
The Data Scientist – Agentic AI builds and operationalizes the core agentic workflows that power Marvis, Juniper's next-generation AI assistant for network operations. Working at the intersection of data science, generative AI, and production engineering, this role is responsible for designing, implementing, and evaluating the reasoning pipelines, tool-calling patterns, skills, and MCP server integrations that enable Marvis to autonomously diagnose, troubleshoot, and resolve complex networking problems. The ideal candidate combines deep hands-on experience with LLM-based agentic frameworks (LangGraph preferred) with the software engineering rigor needed to ship reliable, observable AI systems in a cloud-native environment.
Management Level Definition: Contributions impact technical components of products, solutions, or services regularly and sustainably. Applies advanced subject matter knowledge to solve complex business and technical problems and is regarded as a subject matter expert in agentic AI and applied GenAI. Provides expertise and partnership to functional and technical project teams and may participate in cross-functional initiatives. Exercises significant independent judgment to determine the best method for achieving objectives. May provide team leadership and mentoring to others.
Responsibilities:
Design, implement, and iterate on agentic workflows using LangGraph, including ReAct orchestration loops, dynamic tool selection and binding, multi-step reasoning, and self-correction patterns.
Develop and maintain MCP (Model Context Protocol) servers and skills — defining tool schemas, implementing domain-specific tools, writing skill playbooks (SKILL.md), and managing server lifecycle (versioning, deployment, monitoring).
Integrate and optimize LLM capabilities at production scale, including structured outputs, streaming, function/tool calling, prompt engineering, and robust error handling across agent execution paths.
Build and refine retrieval and memory services for agentic systems, including RAG pipelines, vector-store-backed semantic search, hybrid retrieval, long-term agent memory (semantic, episodic, procedural), and relevance tuning.
Design and execute evaluation frameworks for non-deterministic agentic systems — defining metrics, building test harnesses, running A/B tests on skills and tool configurations, and driving continuous quality improvement.
Collaborate with domain experts (network engineers, product managers) to formalize networking problems as agentic workflows, translating troubleshooting playbooks into skills, tools, and data pipelines.
Develop data analysis and transformation logic that runs in sandboxed execution environments (Code Mode), including multi-tool orchestration scripts, data aggregation, and visualization.
Deploy and operate containerized services in Kubernetes, contributing to CI/CD pipelines, container image management, health probes, and resource optimization.
Own observability for agentic workflows — implementing tracing, logging, cost tracking, and performance monitoring to ensure reliability of non-deterministic systems in production.
Education and Experience Required:
Master's or PhD degree in computer science, data science, mathematics, statistics, or a closely related quantitative discipline.
Typically, 4–6 years of experience building production ML/AI systems, with at least 1–2 years of hands-on work with generative AI and LLM-based applications.
Knowledge and Skills:
Agentic AI & GenAI (Required):
Production experience with agentic orchestration frameworks: LangGraph (strongly preferred), LangChain, Claude Agent SDK, or equivalent — beyond prototypes.
Solid understanding of agentic design patterns: ReAct loops, tool/function calling, dynamic tool binding, skill-based execution, multi-step planning, and self-correction.
Hands-on experience with MCP (Model Context Protocol) or equivalent tool-serving protocols: tool schema design, server implementation, registry management.
LLM API integration at scale: prompt engineering, structured outputs, streaming, error handling, and cost optimization.
RAG pipeline design: chunking strategies, re-ranking, hybrid search, vector stores (OpenSearch or equivalent), and relevance optimization.
Experience building evaluation and testing frameworks for non-deterministic AI systems (offline evals, A/B testing, LLM-as-judge).
Data Science & ML (Required):
Strong foundation in statistical and machine learning techniques — anomaly detection, time-series analysis, clustering, causal inference, or related methods.
Applied ML intuition: knowing when to use retrieval vs. fine-tuning, prompt engineering vs. structured generation, and how to debug model behavior in production.
Proficient Python developer with experience in production codebases (not just notebooks).
Infrastructure & Production Systems (Required):
Kubernetes: deploying, scaling, and managing workloads (Deployments, Services, ConfigMaps, Secrets, health probes).
CI/CD pipelines for automated build, test, and deploy (Jenkins, GitHub Actions, ArgoCD, or similar).
Container image management: building, tagging, versioning via Docker; familiarity with a container registry (ECR, GCR).
Backend service development: FastAPI or equivalent; REST/GraphQL API design.
Observability for AI systems: experience with tracing, monitoring, and logging tools (LangFuse, Prometheus, or equivalent).
Additional Preferred Skills:
Experience with agent memory systems (e.g., LangMem, custom memory architectures).
Familiarity with sandboxed code execution environments (E2B, Firecracker, or similar).
Networking domain knowledge (wireless/wired diagnostics, network troubleshooting) is a strong plus but not required.
Experience with AWS Bedrock, OpenSearch Serverless, or similar managed AI/ML services.
Great written and verbal communication skills; ability to articulate technical designs to senior leadership.
What We Can Offer You:
Health & Wellbeing
We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.
Personal & Professional Development
We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have — whether you want to become a knowledge expert in your field or apply your skills to another division.
Unconditional Inclusion
We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.
Let's Stay Connected:
Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.
Job: Engineering
Job Level: TCP_04
"The expected salary/wage range for this position is provided below. Actual offer may vary from this range based upon geographic location, work experience, education/training, and/or skill level.
– United States of America: Annual Salary USD 155,500 - 315,000 in California
The listed salary range reflects base salary. Variable incentives may also be offered."
Information about employee benefits offered in the US can be found at https://myhperewards.com/main/new-hire-enrollment.html
HPE is an Equal Employment Opportunity/Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together.
Hewlett Packard Enterprise is EEO Protected Veteran/Individual with Disabilities.
HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.
Recruitment Fraud Alert
We have become aware of an increase in fraudulent recruitment activities in which individuals impersonate our company or authorized recruitment agencies to offer fake employment opportunities. These scams may occur through false websites, emails, social media, or chat-based applications and often aim to obtain personal information or money. Please note that Hewlett Packard Enterprise (HPE), its direct and indirect subsidiaries and affiliated companies, and its authorized recruitment agencies/vendors will never charge a candidate a registration fee, hiring fee, or any other fee in connection with its recruitment and hiring process. We also never request personal information such as bank account details, Social Security numbers, or national IDs via social media or chat applications.
All legitimate job opportunities will come through official company channels, and candidates are responsible for verifying the credentials of any third party claiming to represent the company. Any reliance on fraudulent communication is at the individual’s own risk, and HPE disclaims legal liability for any resulting damages. If you suspect recruitment fraud, do not share personal information or make any payments and report the incident to your local authorities immediately.