Data Architect (Big Data)

MetalSite

Pittsburgh, PA

Industry: Technology


8 - 10 years

Posted 165 days ago

Overview:  Management Science Associates, Inc. (MSA) is a diversified information management company that for nearly half a century has given market leaders the competitive edge in data management, analytics and technology.  We are seeking a highly technical and experienced Data Integration Architect to join the MSA Information Management Solutions (IMS) division to lead the building of our next-generation data and integration infrastructure delivering near real-time, ‘Big Data’ capable business intelligence and analytics products for the CPG industry.


The Big Data Architect is responsible for ensuring that the data architecture, modeling and data governance conform to standards and business needs.  In addition, the Big Data Architect leads data architecture and design, new technology implementation, research and evaluation of emerging data management technologies, strategic technology planning, consulting, software validation and e-business support functions.  The role also initiates data architecture related activities, including extract-transform-load (ETL), data quality, master data management (MDM), data integration processes, and database and data warehouse solutions design, through business needs analysis and routine issue resolution.



  • Recommend, architect, design, help develop, operate and maintain a data and application integration infrastructure supporting large-scale analytics, with clear, coherent data models, integration and ETL processes, replication schemes and query optimization techniques that facilitate highly varied (structured, semi-structured, and unstructured) and high-velocity (near real-time) data processing and delivery, and with large dataset acquisition, analysis, storage, cleansing, transformation and reclamation as the design center for approaching instantaneous processing
  • Ensure the architecture is optimized for large dataset acquisition, analysis, storage, cleansing, transformation and reclamation
  • Develop standards and methodologies for the enterprise data warehouse and its infrastructure, including: benchmarking; performance evaluation and testing; data security and data privacy; metadata and provenance management; governance, stewardship, data quality and lifecycle management; impact analysis; performance tuning and capacity planning; and the addition of new data source systems, integration business rules and logic as required by growth
  • Ensure architecture adherence to enterprise architecture best practices that meet or exceed MSA and customer requirements for security, reliability, availability, manageability, performance, scalability, functionality, flexibility and responsiveness
  • Establish and maintain effective data integration policies, procedures, standards and methodologies; metrics for data quality, metadata management, and master data management; and application integration policies, procedures, standards, methodologies, and metrics for EAI, SOA, and ESB technologies
  • Provide recommendations, technical direction and leadership for incorporation of new tactical technologies such as Hadoop and NoSQL as part of the overall integration architecture
  • Manage and review the enterprise data management and integration architecture implementation for financial cost/ROI analysis, impact on resources and platform capacity analysis, performance analysis, security features and vulnerabilities, costs against budgets, and continuous improvement of productivity
  • Participate in regular status meetings to track progress, resolve issues, mitigate risks and escalate concerns in a timely manner
  • Contribute to the development, review, and maintenance of product requirements documents, technical design documents, and functional specifications
  • Help architect and design innovative, customer-centric solutions based on deep knowledge of large-scale data and application integration technology and the CPG industry
  • Collaborate with Product and Project Management, Architecture, Testing, Quality Assurance, Operations, Client Services, and executive management to ensure the integration efforts meet release schedules, goals, and objectives

Required Skills

  • Four-year degree in Computer Science/Software Engineering or a related degree program, or equivalent application development, implementation and operations experience
  • An advanced degree, such as an MBA or a Master's or PhD in Computer Science/Software Engineering or a related scientific degree program, is preferred
  • Minimum eight years of related enterprise application and database development experience, including at least three years of experience as a Big Data Architect working with technologies such as Hadoop, Kafka, Spark, Hive, Redshift, and HBase and large data sets. 
  • Knowledge and implementation experience with data management techniques such as Master Data Management (MDM), Data Quality (DQ) management, Complex Event Processing (CEP), metadata management, text analytics and mining, in-memory processing and hardware acceleration options, data profiling, batch processing, micro-batches, streaming data loads, and Change Data Capture (CDC)
  • SQL
  • Hadoop, Spark, Hive, Redshift (or similar MPP data warehousing platforms)
  • Knowledge and implementation experience with the integration and aggregation of large data sets into high-volume data warehouse applications using highly-varied and high-velocity data collection systems
  • Knowledge of multiple vendors' technologies (Oracle, Microsoft, Amazon Redshift and open source Hadoop distributions) supporting OLTP, multi-dimensional or OLAP databases, and of accelerator tools and technologies implemented as delivery platforms for Business Intelligence, Data Warehousing and Data Analytics
  • General experience with data governance, stewardship, risk and compliance
  • Working knowledge of Complex Event Processing (CEP), text analytics and mining, and in-memory processing
  • Demonstrated independent and successful judgment about methods, techniques and evaluation criteria for obtaining results
  • Working knowledge of operations workflow schedulers, workload management, application and system availability, scalability and distributed data platforms
  • Excellent communication skills, particularly in presenting complex findings clearly to audiences at various levels of the organization
  • Ability to integrate research and best practices into problem avoidance and continuous improvement
  • Qualified candidates must be well-versed in data warehousing, business intelligence, and advanced analytics incorporating ‘Big Data’ contexts and parallel processing technologies