
Walmart: Senior Software Engineer – Site Reliability


This is a Full-time position in Tontitown, AR posted May 1, 2021.

Position Summary…

What you’ll do…

As a member of the SRE team, you will work with other SRE and DevOps practitioners to produce mission-critical infrastructure, tools, and processes that ensure the highest levels of availability and reliability of all our websites, systems and services.

As a senior member of the team you will be expected to work with management, peers, and customers to define and implement the technical vision of the team.

You’re right for the job if you’re comfortable with deep technical Linux, networking topics, and distributed architectures.

You will work cross-functionally amongst a variety of teams and be a core contributor in every significant engineering service or solution that we deliver to our stakeholders.

You’ll excel if you have enthusiasm for digging deep, and a flair for sharp technical communication, prioritization and organization.

You will work directly with our Software Engineering teams to build our next-generation “always up” cloud-based e-commerce/Retail and Enterprise platform.

Site Reliability Engineers are hybrid systems and software engineers who take ownership of reliability, scalability, automation, and other issues related to uptime and availability of Walmart’s e-commerce/Retail and Enterprise platform.

Our goal is to build, scale and guard the systems that delight our customers.

To do so, you will need strong skills in the following areas:

- Design, write and build tools to improve the reliability, latency, availability and scalability of Walmart e-commerce/Retail and Enterprise products.
- Engender reliability and availability, starting with metrics and measurements.
- Enable scaling by providing tools, developing training and/or augmenting processes.
- Build tools and automation to prevent recurrence of problems in mission-critical products and services.
- Augment existing instrumentation to build a cohesive picture of the characteristics of our systems, with special attention to points of failure.
- Participate in capacity planning, demand forecasting, software performance analysis and system tuning.
- Develop a deep understanding of the various services and applications that come together to deliver Walmart e-commerce/Retail and Enterprise products.
- Design new monitoring tools and smart alerts that help discover failures and issues in a timely fashion, and work with engineers to identify root causes and fix issues.
- Influence, design and create new architectures, standards and methods for large-scale enterprise systems.
- Perform root-cause analysis of complex problems involving multiple parties, networks, hardware and software that relate to scaling and performance.
- Participate in on-call rotation.
- Secure the system from issues, be they real, perceived or notional.
- Maintain a high focus on collecting and inferring metric documentation to be used by others to build and maintain systems.
- Scripting and development responsibilities.
- Experience with configuration management tools such as Ansible, SaltStack, Chef and Puppet.
- Build and drive the automation systems that maintain system health.
- Eliminate single points of failure and test disaster recovery and HA regularly.

Additional responsibilities may include:

Drives standardization and service-focused instrumentation.

Provides subject matter expertise.

Resolves break/fix scenarios, engaging broader teams as necessary; and partners/leads to achieve continuous improvement.

Contributes to command-and-control activities focused on the rapid restoration of complex outages.

Participates in a 24/7 on-call rotation.

May work independently or as part of a team on more complex projects.

Provides mentoring and guidance to more junior team members.

Creates systems engineering and architectural [...] Develop software in several modern languages.

Develops large/complex database-backed systems and has an understanding of DB schema and query performance.

Utilizes professional best practices in day-to-day work like revision control, unit testing, or other.

Applies statistical data analysis techniques.

Networking responsibilities: Understands and performs TCP dumps, and uses snoop and other network sniffers.

Understands and applies knowledge of most protocols (TCP/IP, HTTP, UDP, etc.).

Application Technologies: Provides recommendations and advice to the team and/or department in the areas of web services, OS, and storage, including being an active liaison to Development, QA and the Business.

Analyzes systems and makes recommendations to prevent possible problems.

Takes lead on issue resolution activities using knowledge of complex and company-wide systems.

Leads end-to-end audits of monitors and alarms based on subsystem knowledge.

Utilizes time management and project management skills to lead the resolution of issues in a timely and organized manner, effectively communicating necessary information.

May consult directly with developers or third-party vendors; provides subject matter expertise.

Consistent exercise of independent judgment and discretion in matters of significance.

Other duties and responsibilities as assigned.

Qualifications:

- 6+ years in a software development, DevOps, or SRE role.
- Experience in designing, investigating, analyzing and troubleshooting large-scale enterprise systems.
- Methodical and systematic problem-solving approach, combined with a solid awareness of ownership, initiative and drive.
- Fluency with running services at scale; in-depth understanding of Unix systems internals and networking.
- Networking knowledge and in-depth understanding of network concepts, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing.
- Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.
- Experience administering Linux systems in a production environment.
- Programming experience in one or more of the following languages: Go, Java, Python, Ruby, Shell.
- Bachelor’s Degree in Computer Science or a related field, or relevant work experience.
- Experience with distributed version control like Git or similar.
- Experience with IaaS and PaaS providers such as AWS, Azure, OpenStack, GCP.
- Experience with containerization and container platforms (Docker, Kubernetes, Docker EE, OpenShift, Mesosphere).
- Experience with enterprise monitoring solutions like Dynatrace, AppDynamics, New Relic, Prometheus, Graphite, Grafana, Nagios, Sensu and Splunk.
- Familiarity with continuous integration/deployment processes and tools such as Jenkins, Maven, Nexus, etc.

Minimum Qualifications…

Outlined below are the required minimum qualifications for this position.

If none are listed, there are no minimum qualifications.

Bachelor’s degree in Computer Science and 3 years’ experience in software engineering or related field OR 5 years’ experience in software engineering or related field.

Preferred Qualifications…

Outlined below are the optional preferred qualifications for this position.

If none are listed, there are no preferred qualifications.

Master’s degree in Computer Science or related field and 2 years’ experience in software engineering or related field.

Primary Location…

805 SE MOBERLY LN, BENTONVILLE, AR 72712, United States of America
