Site Reliability Engineer (Fintech) Job at Inabia Software & Consulting Inc., Frisco, TX

  • Inabia Software & Consulting Inc.
  • Frisco, TX

Job Description

The client is looking for candidates with 10+ years of experience.

Title: Site Reliability Engineer (Fintech)
Duration: Contract
Location: Bellevue, WA; Frisco, TX; Atlanta, GA; Overland Park, KS (Hybrid)

Local candidates required.
All work authorizations are workable except OPT and H1T.
When sharing a resume, please mention the candidate's location and work authorization.

Job Description:
Must-Have Skills:
Skill 1 – 8 Yrs of Exp – Kubernetes, AWS Cloud
Skill 2 – 8 Yrs of Exp – Reliability Engineering

We are looking for an experienced Platform Site Reliability Engineer (SRE) with deep expertise in Kubernetes and AWS to help us enhance the performance, scalability, and reliability of our digital payment platform. You will play a critical role in ensuring the availability and resilience of our cloud-native services, with a focus on automation, monitoring, and performance optimization.

Job Description:
As a Platform SRE with a focus on Kubernetes and AWS, you will work with cross-functional teams to design, implement, and maintain scalable, secure, and high-performing infrastructure. You will be responsible for managing Kubernetes clusters, automating infrastructure deployments, and leveraging AWS services to ensure platform reliability, availability, and continuous improvement.

Key Responsibilities:
•    Kubernetes Management: Deploy, manage, and optimize Kubernetes clusters in production and staging environments, ensuring high availability and efficient resource utilization.
•    AWS Infrastructure: Leverage AWS cloud services (EC2, S3, RDS, EKS, Lambda, etc.) to build, manage, and scale cloud-native infrastructure.
•    Automation & Infrastructure as Code: Develop and maintain automated workflows using Infrastructure as Code (IaC) tools like Terraform, CloudFormation, or Ansible to provision, configure, and manage cloud infrastructure.
•    CI/CD Pipeline Support: Build, optimize, and maintain CI/CD pipelines to enable seamless code delivery and deployments, using tools like Jenkins, GitLab CI, or CircleCI.
•    Monitoring & Observability: Implement and maintain monitoring, alerting, and logging solutions using tools such as Prometheus, Grafana, CloudWatch, or ELK stack to ensure system health and availability.
•    Incident Response: Lead and support incident response efforts, conduct root cause analysis, and implement post-incident reviews to improve system resilience.
•    Performance Optimization: Identify and resolve performance bottlenecks, improve system efficiency, and ensure applications and infrastructure are optimized for both cost and performance.
•    Security & Compliance: Work with security teams to implement best practices for securing Kubernetes clusters, AWS resources, and platform infrastructure, including access controls, network policies, and encryption.
•    Collaboration & Documentation: Work closely with development, DevOps, and infrastructure teams to align on best practices, improve automation, and document procedures for infrastructure management and troubleshooting.

Required Qualifications:
•    Experience: 3+ years of experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role with hands-on experience in cloud-native infrastructure.
•    Kubernetes Expertise: Strong expertise in managing and scaling Kubernetes clusters, including experience with Kubernetes networking, storage, and multi-cluster architectures.
•    AWS Cloud Expertise: Proficiency with AWS services such as EC2, S3, EKS, RDS, VPC, Lambda, IAM, CloudWatch, and others. Experience with AWS best practices for scalability, security, and cost management.
•    Infrastructure as Code (IaC): Hands-on experience with IaC tools such as Terraform, AWS CloudFormation, or Ansible for provisioning and managing cloud infrastructure.
•    CI/CD Pipelines: Experience building and maintaining continuous integration and continuous deployment (CI/CD) pipelines using Jenkins, GitLab CI, or similar tools.
•    Scripting & Automation: Proficiency in scripting languages such as Python, Bash, or Go to automate operational tasks and improve workflows.
•    Monitoring & Logging: Experience with monitoring, logging, and alerting tools like Prometheus, Grafana, CloudWatch, ELK stack, or similar tools.
•    Troubleshooting & Incident Management: Ability to troubleshoot complex issues in distributed systems, conduct root cause analysis, and implement solutions to prevent recurrence.
•    Collaboration Skills: Strong communication skills with the ability to work collaboratively with developers, operations, and product teams.

Job Tags

Contract work, Local area
