Job Description
As a GPU Software Machine Learning Engineer, you are responsible for working with a team that is developing & optimizing Adreno GPU drivers for standard APIs such as OpenCL, OpenGL ES, Vulkan, and DirectX. You will be exposed to technology areas such as Image Processing and Machine Learning.
AI GPU Runtime Software Engineer
Qualifications
- An ideal candidate should be familiar with GPU runtime APIs, GPU drivers, GPU architectures, operating systems, parallel/asynchronous programming, and efficient resource management
- The candidate should be comfortable performing quantitative analysis of workloads and driving improvements at the appropriate layers of the software stack
- Most importantly, the candidate should be willing to learn and work across boundaries
- BS/MS (Computer Science, Computer Engineering, Electrical Engineering, or related equivalent)
Benefits
- At AMD, your base pay is one part of your total rewards package
- Your base pay will depend on where your skills, qualifications, experience, and location fit into the hiring range for the position
- You may be eligible for incentives based upon your role such as either an annual bonus or sales incentive
- Many AMD employees have the opportunity to own shares of AMD stock, as well as a discount when purchasing AMD stock if voluntarily participating in AMD’s Employee Stock Purchase Plan
- You’ll also be eligible for competitive benefits described in more detail here
Responsibilities
- Design, develop, and maintain GPU-related runtime implementations in IREE over HIP, CUDA, Vulkan, DirectX, and Metal
- Design, develop, and maintain multi-GPU runtime and communication solutions including collectives
- Manage testing and releasing of runtime components
- Quantitatively analyze end-to-end model performance, identify bottlenecks, propose improvements, and prototype and productionize solutions
- Design and implement compiler passes to better schedule and utilize resources
- Design and implement Python interactions with runtime components
- Drive towards general solutions that benefit all GPU targets and the overall community
The Role
We are building IREE as an open-source compiler and runtime solution to productionize ML across a variety of usage scenarios and hardware targets. Wide and performant GPU support is critical to that goal. We aim for broad GPU coverage, from mobile to datacenter, via a unified software stack. This requires us to write the most efficient code to interact with the OS and device drivers, with minimal dependencies and small binary size. There will be no shortage of intriguing technical challenges to tackle, and there are abundant chances to collaborate with industry experts working at different layers of the stack. If this sounds interesting to you, please don’t hesitate to reach out to us!
The Person
An ideal candidate should be familiar with GPU runtime APIs, GPU drivers, GPU architectures, operating systems, parallel/asynchronous programming, and efficient resource management. The candidate should be comfortable performing quantitative analysis of workloads and driving improvements at the appropriate layers of the software stack. Most importantly, the candidate should be willing to learn and work across boundaries.
Preferred Experience with the Following Tools/Flows
• Experience with GPU APIs (HIP, CUDA, Vulkan, DirectX, Metal)
• Understanding of GPU architectures
• Understanding of parallel/asynchronous programming
• Familiarity with operating system internals and resource management
• Understanding of game engine internals
• Experience with various system debugging/benchmarking/profiling tools
• Strong C/C++ understanding and skills
• Familiarity with IREE, MLIR, LLVM, SPIR-V or other compiler technologies
• Open-source development ethos
Location:
San Jose, CA, USA