---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:809
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
  - source_sentence: >-
      Senior Data Analyst Pricing, B2B B2C Pricing Strategies, A/B Testing
      Analysis
    sentences:
      - |-
        Qualifications
         Data Engineering, Data Modeling, and ETL (Extract Transform Load) skills
         Data Warehousing and Data Analytics skills
         Experience with data-related tools and technologies
         Strong problem-solving and analytical skills
         Excellent written and verbal communication skills
         Ability to work independently and remotely
         Experience with cloud platforms (e.g., AWS, Azure) is a plus
         Bachelor's degree in Computer Science, Information Systems, or related field
      - |-
        Skills You Bring
         Bachelor’s or Master’s Degree in a technology related field (e.g. Engineering, Computer Science, etc.) required, with 6+ years of experience
         Informatica Power Center
         Good experience with ETL technologies
         Snaplogic
         Strong SQL
         Proven data analysis skills
         Strong data modeling skills doing either Dimensional or Data Vault models
         Basic AWS experience
         Proven ability to deal with ambiguity and work in a fast-paced environment
         Excellent interpersonal and communication skills
         Excellent collaboration skills to work with multiple teams in the organization
      - >-
        experience, an annualized transactional volume of $140 billion in 2023,
        and approximately 3,200 employees located in 12+ countries, Paysafe
        connects businesses and consumers across 260 payment types in over 40
        currencies around the world. Delivered through an integrated platform,
        Paysafe solutions are geared toward mobile-initiated transactions,
        real-time analytics and the convergence between brick-and-mortar and
        online payments. Further information is available at www.paysafe.com.


        Are you ready to make an impact? Join our team that is inspired by a
        unified vision and propelled by passion.


        Position Summary


        We are looking for a dynamic and flexible Senior Data Analyst, Pricing,
        to support our global Sales and Product organizations with strategic
        planning, analysis, and commercial pricing efforts. As a Senior Data
        Analyst, you will be at the frontier of building our Pricing function
        to drive growth through data and AI-enabled capabilities. This is a
        high-visibility opportunity for someone hungry to drive the upward
        trajectory of our business and to contribute directly to our success.


        You will partner with Product Managers to understand their commercial
        needs, then prioritize and work with a cross-functional team to deliver
        pricing strategies and analytics-based solutions to solve and execute
        them. Business outcomes will include sustainable growth in both revenues
        and gross profit.


        This role is based in Jacksonville, Florida and offers a flexible hybrid
        work environment with 3 days in the office and 2 days working remote
        during the work week.


        Responsibilities

         Build data products that power the automation and effectiveness of our pricing function, driving better quality revenues from merchants and consumers.
         Partner closely with pricing stakeholders (e.g., Product, Sales, Marketing) to turn raw data into actionable insights. Help ask the right questions and find the answers.
         Dive into complex pricing and behavioral data sets, spot trends and make interpretations.
         Utilize modelling and data-mining skills to find new insights and opportunities.
         Turn findings into plans for new data products or visions for new merchant features.
         Partner across merchant Product, Sales, Marketing, Development and Finance to build alignment, engagement and excitement for new products, features and initiatives.
         Ensure data quality and integrity by following and enforcing data governance policies, including alignment on data language.

          Qualifications  

         Bachelor’s degree in a related field of study (Computer Science, Statistics, Mathematics, Engineering, etc.) required.
         5+ years of experience in an in-depth data analysis role, required; preferably in a pricing context with B2B & B2C in a digital environment.
         Proven ability to visualize data intuitively, cleanly and clearly in order to make important insights simplified.
         Experience across large and complex datasets, including customer behavior and transactional data.
         Advanced in SQL and in Python, preferred.
         Experience structuring and analyzing A/B tests, elasticities and interdependencies, preferred.
         Excellent communication and presentation skills, with the ability to explain complex data insights to non-technical audiences.

         Life at Paysafe: 

        One network. One partnership. At Paysafe, this is not only our business
        model; this is our mindset when it comes to our team. Being a part of
        Paysafe means you’ll be one of over 3,200 members of a world-class team
        that drives our business to new heights every day and where we are
        committed to your personal and professional growth.


        Our culture values humility, high trust & autonomy, a desire for
        excellence and meeting commitments, strong team cohesion, a sense of
        urgency, a desire to learn, pragmatically pushing boundaries, and
        accomplishing goals that have a direct business impact.

         

        Paysafe provides equal employment opportunities to all employees, and
        applicants for employment, and prohibits discrimination of any type
        concerning ethnicity, religion, age, sex, national origin, disability
        status, sexual orientation, gender identity or expression, or any other
        protected characteristics. This policy applies to all terms and
        conditions of recruitment and employment. If you need any reasonable
        adjustments, please let us know. We will be happy to help and look
        forward to hearing from you.
  - source_sentence: ETL Pipelines, Apache Spark, AirFlow
    sentences:
      - >-
        Experience as a Product Data Analyst at TGG: Achieving business results
        as a client facing consultant for our clients in various types of
        engagements within a variety of industries. Delivering high quality
        work to our clients within our technology service line. Being part of a
        collaborative, values-based firm that has a reputation for great work
        and satisfied clients. Working with senior IT leaders to communicate
        strategic goals to their organization, including leading client and
        internal development teams on best practices.

        What You Will Work On: Analyze large datasets to identify patterns,
        trends, and opportunities for product optimization. Develop and
        maintain dashboards and reports to track key performance metrics.
        Collaborate with product managers, marketers, and engineers to ideate,
        prioritize, and implement data-driven initiatives. Conduct A/B testing
        and other statistical analyses to evaluate the effectiveness of product
        changes. Communicate findings and recommendations to stakeholders
        through clear and concise presentations. Contribute analytical insights
        to inform product vision and deliver value.

        Who Will You Work With: Client stakeholders ranging from individual
        contributors to senior executives. A collaborative team of consultants
        that deliver outstanding client service. TGG partners, principals,
        account leaders, managers, and staff supporting you to excel within
        client projects and to achieve your professional development goals.

        Examples of What You Bring to the Table: You have strong analysis
        capabilities and thrive on working collaboratively to deliver
        successful results for clients. You have experience with these
        technologies: Proficiency in SQL and Python for data extraction,
        manipulation, and analysis. Strong understanding of statistical
        concepts and techniques. Intermediate experience with Tableau, Power
        BI, Adobe Analytics, or similar BI tools. Ability to analyze
        requirements, design, implement, debug, and deploy Cloud Platform
        services and components. At least basic exposure to data science and
        machine learning methods. Familiarity with source control best
        practices: Define, Setup/Configure, Deploy and Maintain source code
        (e.g. GIT, VisualSafe Source). Ability to develop and schedule
        processes to extract, transform, and store data from these systems: SQL
        databases, Azure cloud services, Google cloud service, Snowflake. 4-8
        years of relevant experience. Bachelor’s degree in Computer Science,
        Statistics, Economics, Mathematics, or a related field; or equivalent
        combination of education, training, and experience. Analytical Product
        Mindset: Ability to approach problems analytically and derive
        actionable insights from complex datasets, while remaining focused on
        providing value to customers. Strategic Thinking: Demonstrated ability
        to translate data findings into strategic, achievable recommendations
        to drive business outcomes. Communication Skills: Excellent verbal and
        written communication skills. Ability to effectively convey technical
        concepts from technical to non-technical stakeholders and vice-versa.
        Team Player: Proven track record of collaborating effectively with
        cross-functional teams in a fast-paced environment. Adaptability: Have
        consistently demonstrated the ability to bring structure to complex,
        unstructured environments. Familiarity with Agile development
        methodologies. Ability to adapt to changing priorities to thrive in
        dynamic work environments.

        Salary and Benefits:Nothing is more important to us than the well-being
        of our team. That is why we are proud to offer a full suite of
        competitive health benefits along with additional benefits such as:
        flexible PTO, a professional development stipend and work from home
        stipend, volunteer opportunities, and team social activities.

        Salaries vary and are dependent on considerations such as: experience
        and specific skills/certifications. The base plus target bonus total
        compensation range for this role is $95,000 - $125,000. Additional
        compensation beyond this range is available as a result of leadership
        and business development opportunities. Salary details are discussed
        openly during the hiring process. 

        Work Environment:TGG is headquartered in Portland, Oregon, and has team
        members living in various locations across the United States. Our
        consultants must have the ability to travel and to work remotely or
        onsite. Each engagement has unique conditions, and we work
        collaboratively to meet both our client and team's needs regarding
        onsite and travel requirements. 

        Why The Gunter Group:TGG was created to be different, to be relational,
        to be insightful, and to maximize potential for our consultants, our
        clients, and our community. We listen first so we can learn, analyze,
        and deliver meaningful solutions for our clients. Our compass points
        towards our people and our “Non-Negotiables” always. Our driven
        employees make us who we are: a talented team of leaders with deep and
        diverse professional experience. If you think this role is the right
        fit, please submit your resume and cover letter so we can learn more
        about you. 
        please submit your resume and cover letter so we can learn more about
        you. 

        The Gunter Group LLC is
      - >-
        Requirements & Day-to-Day:  Design, develop, and support scalable data
        processing pipelines using Apache Spark and Java/Scala. Lead a talented
        team and make a significant impact on our data engineering capabilities.
        Implement and manage workflow orchestration with AirFlow for efficient
        data processing. Proficiently use SQL for querying and data manipulation
        tasks. Collaborate with cross-functional teams to gather requirements
        and ensure alignment with data engineering solutions.  Essential
        Criteria:  a bachelor’s degree in computer science or another relevant
        discipline, and a minimum of five years of relevant experience in data
        engineering. Solid experience with Apache Spark for large-scale data
        processing. Proficiency in Java or Scala programming languages. Strong
        knowledge of AirFlow for workflow orchestration. Proficient in SQL for
        data querying and manipulation.
      - >-
        experienced Data Engineer to join our world leading footwear client. The
        ideal candidate will have 6-7 years of relevant experience, with a focus
        on practical application in AWS tech stack. Experience with Databricks,
        Spark, and Python for coding is essential.


        *W2 ONLY, NO C2C*



        Key Qualifications:


        Bachelor’s degree in Computer Science or related field. 6-7 years of
        data engineering experience. Proficiency in AWS, Databricks, Spark, and
        Python. Ability to work in complex environments with diverse projects.
        Strong communication and collaboration skills.



        Mainz Brady Group is a technology staffing firm with offices in
        California, Oregon and Washington. We specialize in Information
        Technology and Engineering placements on a Contract, Contract-to-hire
        and Direct Hire basis. Mainz Brady Group is the recipient of multiple
        annual Excellence Awards from the Techserve Alliance, the leading
        association for IT and engineering staffing firms in the U.S.


        Mainz Brady Group is
  - source_sentence: >-
      dialog evaluation processes, speech interaction analysis, data collection
      conventions
    sentences:
      - >-
        experience with the refactor of the Macro code from a local Python/R
        implementation to Databricks (Python/PySpark). Analytical expert who
        utilizes his/her skills in both technology and social science to find
        trends and manage data. They use industry knowledge, contextual
        understanding, and skepticism of existing assumptions to uncover
        solutions to business challenges. Collecting, analyzing and cleaning up
        data. Creating algorithms for processing catalog products using
        different data sources. Experimenting with different models and neural
        networks, creating model ensembles. Creating a workflow for publishing
        algorithms to production. Strong skills in machine and/or deep learning
        algorithms, data cleaning, feature extraction, and generation.
        Demonstrated computational skills and experience with Python.
        Experience executing and presenting independent analysis. Must have
        skills: Python (Programming Language), R (Programming Language),
        PySpark, Databricks.
      - >-
        requirements. You will work closely with cross-functional teams to
        develop and implement data processing solutions that align with business
        needs. Additionally, you will be responsible for ensuring the quality
        and integrity of data while optimizing performance and ensuring data
        security. The successful candidate must have at least 5 years of
        experience in data engineering, with a strong focus on Azure Databricks
        and Azure Data Factory. You should be able to design and develop
        efficient data processing pipelines and should be proficient in SQL
        queries. Experience in JIRA is a must. Must have below skills: • SQL
        Queries • SSIS • Data Factory • Databricks • JIRA.


        Thanks & Regards, Joshua, Delivery Manager
      - >-
        experience with speech interfaces Lead and evaluate changing dialog
        evaluation conventions, test tooling developments, and pilot processes
        to support expansion to new data areas Continuously evaluate workflow
        tools and processes and offer solutions to ensure they are efficient,
        high quality, and scalable Provide expert support for a large and
        growing team of data analysts Provide support for ongoing and new data
        collection efforts as a subject matter expert on conventions and use of
        the data Conduct research studies to understand speech and
        customer-Alexa interactions Assist scientists, program and product
        managers, and other stakeholders in defining and validating customer
        experience metrics


        We are open to hiring candidates to work out of one of the following
        locations:


        Boston, MA, USA | Seattle, WA, USA


        Basic Qualifications

         3+ years of data querying languages (e.g. SQL), scripting languages (e.g. Python) or statistical/mathematical software (e.g. R, SAS, Matlab, etc.) experience
         2+ years of data scientist experience
         Bachelor's degree
         Experience applying theoretical models in an applied environment

        Preferred Qualifications

         Experience in Python, Perl, or another scripting language
         Experience in a ML or data scientist role with a large technology company
         Master's degree in a quantitative field such as statistics, mathematics, data science, business analytics, economics, finance, engineering, or computer science

        Amazon is committed to a diverse and inclusive workplace. Amazon is 


        Our compensation reflects the cost of labor across several US geographic
        markets. The base pay for this position ranges from $111,600/year in our
        lowest geographic market up to $212,800/year in our highest geographic
        market. Pay is based on a number of factors including market location
        and may vary depending on job-related knowledge, skills, and experience.
        Amazon is a total compensation company. Dependent on the position
        offered, equity, sign-on payments, and other forms of compensation may
        be provided as part of a total compensation package, in addition to a
        full range of medical, financial, and/or other benefits. For more
        information, please visit
        https://www.aboutamazon.com/workplace/employee-benefits. This position
        will remain posted until filled. Applicants should apply via our
        internal or external career site.



        Company - Amazon.com Services LLC


        Job ID: A2610750
  - source_sentence: >-
      Senior Data Analyst with expertise in Power BI, data governance, and
      statistical analysis
    sentences:
      - >-
        qualifications and experience.

        RESPONSIBILITIES

        Data Analysis and Insights: Utilize advanced data analysis techniques
        to extract insights from large datasets, identify trends, patterns, and
        correlations, and translate findings into actionable recommendations
        for business stakeholders. Develop predictive models, algorithms, and
        data visualization tools to support decision-making processes, optimize
        business performance, and drive strategic initiatives.

        Strategy Development: Collaborate with senior leadership and key
        stakeholders to develop data-driven strategies and roadmaps that align
        with business objectives and drive innovation across the organization.
        Conduct market research, competitive analysis, and industry
        benchmarking to identify opportunities for growth, differentiation, and
        competitive advantage.

        Technology Engineering: Design, develop, and implement technology
        solutions and platforms to support data analytics, reporting, and
        automation initiatives, leveraging tools and technologies such as SQL,
        Python, R, Tableau, Power BI, and cloud-based platforms. Architect and
        maintain data infrastructure, databases, and systems to ensure
        scalability, reliability, and security of data assets.

        Cross-Functional Collaboration: Partner with cross-functional teams,
        including IT, Marketing, Operations, and Finance, to gather
        requirements, define solution specifications, and ensure successful
        implementation and adoption of data-driven initiatives. Provide
        technical guidance, training, and support to stakeholders to enable
        self-service analytics and empower data-driven decision-making
        throughout the organization.

        Performance Monitoring and Optimization: Monitor and analyze the
        performance of data analytics solutions and technology platforms,
        identifying opportunities for optimization, scalability, and continuous
        improvement. Implement best practices, standards, and governance
        frameworks to ensure data integrity, privacy, and compliance with
        regulatory requirements.

        REQUIREMENTS

        Occasionally lift and/or move up to 25 lbs. Ability to understand and
        follow instructions in English. Ability to sit for extended periods of
        time, twist, bend, sit, walk; use hands to twist, handle or feel
        objects, tools or controls, such as computer mouse, computer keyboard,
        calculator, stapler, telephone, staple puller, etc.; reach with hands
        and arms, balance, stoop, kneel, talk or hear. Specific vision
        abilities required by the job include close vision, distance vision,
        peripheral vision, depth perception and the ability to adjust focus.

        QUALIFICATIONS

        Bachelor's degree in Computer Science, Data Science, Information
        Systems, or related field; Master's degree or relevant certification
        preferred. X years of experience in data analysis, strategy
        development, and technology engineering roles, preferably in the
        financial services or banking industry. Strong proficiency in data
        analysis tools and programming languages, such as SQL, Python, R, and
        experience with data visualization tools such as Tableau or Power BI.
        Solid understanding of data modeling, database design, and data
        warehousing principles, with experience working with relational
        databases and cloud-based platforms. Proven track record of developing
        and implementing data-driven strategies and technology solutions that
        drive business value and operational efficiency. Excellent
        communication, problem-solving, and stakeholder management skills.
        Ability to work independently as well as collaboratively in a
        fast-paced, dynamic environment. Strong analytical mindset, attention
        to detail, and a passion for leveraging data and technology to solve
        complex business challenges.

        ABOUT STEARNS BANK

        Stearns Bank is a leading financial institution dedicated to leveraging
        cutting-edge technology and data analytics to provide innovative
        banking solutions. With a commitment to excellence and continuous
        improvement, Stearns Bank offers a dynamic and collaborative work
        environment for professionals seeking to make a significant impact in
        the finance and technology sectors.

        WHY JOIN STEARNS BANK?

        Opportunity to work at the intersection of finance, technology, and
        data analytics, driving innovation and shaping the future of banking.
        Collaborative and inclusive work culture that values diversity,
        creativity, and continuous learning. Competitive compensation package
        with comprehensive benefits and opportunities for professional
        development and advancement. Make a meaningful impact by leveraging
        your expertise to drive data-driven decision-making and technology
        innovation, contributing to the success and growth of Stearns Bank.

        Note: The above job description is intended to outline the general
        nature and level of work being performed by individuals assigned to
        this position. It is not intended to be construed as an exhaustive list
        of responsibilities, duties, and skills required. Management reserves
        the right to modify, add, or remove duties as necessary to meet
        business needs.

        EQUAL OPPORTUNITY EMPLOYER / AFFIRMATIVE ACTION PLAN

        We are
      - >-
        experience with Kubernetes operating knowledge. Working with data
        pipelines and experience with Spark and Flink. Excellent communication
        skills. Nice to have: programming experience in Scala, Java, and
        Python; knowledge on Machine Learning (Client).

        Job description: The client seeks to improve products by using data as
        the voice of our customers. We are looking for engineers to collaborate
        with users of our infrastructure and architect new pipelines to improve
        the user onboarding experience. As part of this group, you will work
        with petabytes of data daily using diverse technologies like Spark,
        Flink, Kafka, Hadoop, and others. You will be expected to effectively
        partner with upstream engineering teams and downstream analytical &
        product consumers.

        Experience: 10+ YOE, with 5+ years of experience designing and
        implementing batch or real-time data pipelines. Hands-on experience in
        batch processing (Spark, Presto, Hive) or streaming (Flink, Beam, Spark
        Streaming). Experience in AWS and knowledge of its ecosystem.
        Experience in scaling and operating Kubernetes. Excellent communication
        skills are a must, with experience working with customers directly to
        explain how they would use the infrastructure to build complex data
        pipelines. Proven ability to work in an agile environment, flexible to
        adapt to changes. Able to work independently, researching possible
        solutions to unblock customers. Programming experience in Scala, Java,
        or Python. Fast learner; experience with other common big data open
        source technologies is a big plus. Knowledge of machine learning
        (Client) is a nice-to-have.
      - >-
        skills and proficiency/expertise in analytical tools including PowerBI
        development, Python, coding, Excel, SQL, SOQL, Jira, and others. Must
        be detail-oriented, focused on excellent quality deliverables and able
        to analyze data quickly using multiple tools and strategies, including
        creating advanced algorithms. Position serves as a critical member of
        the data integrity team within the digital solutions group and supplies
        detailed analysis on key data elements that flow between systems to
        help design governance and master data management strategies and ensure
        data cleanliness.

        Requirements: 5 to 8 years related experience preferred. Bachelor's
        degree preferred. Power BI, Python, SQL/SOQL, Jira, Excel.
  - source_sentence: data integrity governance PowerBI development Juno Beach
    sentences:
      - >-
        skills: 2-5 y of exp with data analysis/ data integrity/ data
        governance; PowerBI development; Python; SQL, SOQL


        Location: Juno Beach, FL

        PLEASE SEND LOCAL CANDIDATES ONLY


        Seniority on the skill/s required on this requirement: Mid.


        Earliest Start Date: ASAP


        Type: Temporary Project


        Estimated Duration: 12 months with possible extension(s)


        Additional information: The candidate should be able to provide an ID if
        the interview is requested. The candidate interviewing must be the same
        individual who will be assigned to work with our client. 

        Requirements:• Availability to work 100% at the Client’s site in Juno
        Beach, FL (required);• Experience in data analysis/ data integrity/ data
        governance;• Experience in analytical tools including PowerBI
        development, Python, coding, Excel, SQL, SOQL, Jira, and others.


        Responsibilities include but are not limited to the following:• Analyze
        data quickly using multiple tools and strategies including creating
        advanced algorithms;• Serve as a critical member of data integrity team
        within digital solutions group and supplies detailed analysis on key
        data elements that flow between systems to help design governance and
        master data management strategies and ensure data cleanliness.
      - >-
        skills and expertise, experience and other relevant factors (salary may
        be adjusted based on geographic location)

         What does it mean to work at Armstrong?

        It means being immersed in a supportive culture that recognizes you as a
        key player in Armstrong's future. We are a large company with a local
        feel, where you will get to know and collaborate with leadership and
        your colleagues across the company.


        By joining us, you'll have the opportunity to make the most of your
        potential. Alongside a competitive remuneration package, you will
        receive:


        A benefits package including: medical, dental, prescription drug, life
        insurance, 401k match, long-term disability coverage, vacation and sick
        time, product discount programs and many more. Personal development to
        grow your career with us based on your strengths and interests. A
        working culture that balances individual achievement with teamwork and
        collaboration. We draw on each other's strengths and allow for
        different work styles to build engagement and satisfaction to deliver
        results. 



        As a Data Scientist, you will leverage cutting-edge generative AI
        techniques to extract structured data from diverse document types. From
        there, you will build models that understand context, domain-specific
        jargon and generate documents. The output of your work will enable
        long-term strategic advantages for the company.


        Essential Duties and Responsibilities include the following. Other
        duties may be assigned.


        Building AI/ML features to evaluate document quality, account loyalty,
        market trends, etc. Constructing supervised learning datasets. Writing
        robust and testable code. Defining and overseeing regular updates to
        improve precision as the company’s challenges and data evolve.
        Cultivating strong collaborations with teammates and stakeholders.
        Sharing technical solutions and product ideas with the team through
        design/code reviews and weekly meetings.



        Qualifications


        To perform this job successfully, an individual must be able to perform
        each essential duty satisfactorily. The requirements listed below are
        representative of the knowledge, skill, and/or ability required.
        Reasonable accommodations may be made to enable individuals with
        disabilities to perform the essential functions.


        Experience transforming natural language data into useful features
        using NLP techniques to feed classification algorithms. Ability to
        work with dashboarding and visualization software such as Tableau or
        Power BI. Knowledge of software versioning control repositories such as
        GitHub. Ability to translate data insights into actionable items and
        communicate findings in a simplistic way. Experience with generative AI
        would be a plus. Enthusiasm for learning new things and going deep into
        detailed data analysis. Workflow flexibility, team player, and strong
        collaboration skills.



        Education And/or Experience


        BS in Computer Science, Statistics or Applied Mathematics or equivalent
        years of experience2+ years in software development, statistical
        modeling, and machine learning2+ years of experience in an analytical
        field using tools such as Python, R, SAS, MatlabFamiliarity with SQL or
        other querying languages is preferred



        Why should you join Armstrong World Industries?


        Armstrong World Industries (AWI) is a leader in the design and
        manufacture of innovative commercial and residential ceiling, wall and
        suspension system solutions in the Americas. With approximately $1B in
        revenue, AWI has about 2,800 employees and a manufacturing network of
        fifteen facilities in North America.


        At home, at work, in healthcare facilities, classrooms, stores, or
        restaurants, we offer interior solutions that help to enhance comfort,
        save time, improve building efficiency and overall performance, and
        create beautiful spaces.


        For more than 150 years, we have built our business on trust and
        integrity. It set us apart then, and it sets us apart now, along with
        our ability to collaborate with and innovate for the people we're here
        to serve - our customers, our shareholders, our communities and our
        employees.


        We are committed to developing new and sustainable ceiling solutions,
        with design and performance possibilities that make a positive
        difference in spaces where we live, work, learn, heal and play. It's an
        exciting, rewarding business to be in, and we're committed to continue
        to grow and prosper for the benefit of all of our stakeholders. We hope
        you join us.


        Our Sustainability Ambition


        "Bringing our Purpose to Life" - lead a transformation in the design and
        building of spaces fit for today and tomorrow.


        We are committed to:


        Engaging a diverse, purpose-driven workforce;Transforming buildings from
        structures that shelter into structures that serve and preserve the
        health and well-being of people and planet;Pursuing sustainable,
        innovative solutions for spaces where we live, work, learn heal and
        play;Being a catalyst for change with all of our stakeholders; andMaking
        a positive difference in the environments and communities we impact.



        Armstrong is committed to engaging a diverse, purpose-driven workforce.
        As part of our dedication to diversity, AWI is committed to 


        Come and build your future with us and apply today!
      - >-
        QualificationsAdvanced degree (MS with 5+ years of industry experience,
        or Ph.D.) in Computer Science, Data Science, Statistics, or a related
        field, with an emphasis on AI and machine learning.Proficiency in Python
        and deep learning libraries, notably PyTorch and Hugging Face, Lightning
        AI, evidenced by a history of deploying AI models.In-depth knowledge of
        the latest trends and techniques in AI, particularly in multivariate
        time-series prediction for financial applications.Exceptional
        communication skills, capable of effectively conveying complex technical
        ideas to diverse audiences.Self-motivated, with a collaborative and
        solution-oriented approach to problem-solving, comfortable working both
        independently and as part of a collaborative team.


        CompensationThis role is compensated with equity until the product
        expansion and securing of Series A investment. Cash-based compensation
        will be determined after the revenue generation has been started. As we
        grow, we'll introduce additional benefits, including performance
        bonuses, comprehensive health insurance, and professional development
        opportunities. 

        Why Join BoldPine?

        Influence the direction of financial market forecasting, contributing to
        groundbreaking predictive models.Thrive in an innovative culture that
        values continuous improvement and professional growth, keeping you at
        the cutting edge of technology.Collaborate with a dedicated team,
        including another technical expert, setting new benchmarks in AI-driven
        financial forecasting in a diverse and inclusive environment.

        How to Apply

        To join a team that's redefining financial forecasting, submit your
        application, including a resume and a cover letter. At BoldPine, we're
        committed to creating a diverse and inclusive work environment and
        encouraging applications from all backgrounds. Join us, and play a part
        in our mission to transform financial predictions.
datasets:
  - shawhin/ai-job-embedding-finetuning
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-distilroberta-v1
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: ai job validation
          type: ai-job-validation
        metrics:
          - type: cosine_accuracy
            value: 0.9900990099009901
            name: Cosine Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: ai job test
          type: ai-job-test
        metrics:
          - type: cosine_accuracy
            value: 1
            name: Cosine Accuracy

SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model finetuned from sentence-transformers/all-distilroberta-v1 on the ai-job-embedding-finetuning dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

This is the resulting model from example code demonstrating how to fine-tune an embedding model for AI job search.

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
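The architecture above runs the transformer, mean-pools the token embeddings (ignoring padding), then L2-normalizes the result. A rough pure-Python sketch of what the Pooling and Normalize modules do, using toy 2-dimensional vectors rather than the library internals:

```python
import math

# Sketch (not the library code): how Pooling (mean over tokens) and
# Normalize() turn per-token embeddings into one unit-length sentence vector.
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vec, m in zip(token_embeddings, attention_mask):
        if m:
            count += 1
            for i, v in enumerate(vec):
                sums[i] += v
    return [s / count for s in sums]

def l2_normalize(vec):
    """Scale the vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

# Toy example: 3 tokens (the last one is padding), 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)      # [0.5, 0.5]
sentence_vec = l2_normalize(pooled)   # unit-length vector
```

In the real model the vectors are 768-dimensional and come from RobertaModel; the mechanics are the same.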

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("shawhin/distilroberta-ai-job-embeddings")
# Run inference
sentences = [
    'data integrity governance PowerBI development Juno Beach',
    'skills: 2-5 y of exp with data analysis/ data integrity/ data governance; PowerBI development; Python; SQL, SOQL\n\nLocation: Juno Beach, FL\nPLEASE SEND LOCAL CANDIDATES ONLY\n\nSeniority on the skill/s required on this requirement: Mid.\n\nEarliest Start Date: ASAP\n\nType: Temporary Project\n\nEstimated Duration: 12 months with possible extension(s)\n\nAdditional information: The candidate should be able to provide an ID if the interview is requested. The candidate interviewing must be the same individual who will be assigned to work with our client. \nRequirements:• Availability to work 100% at the Client’s site in Juno Beach, FL (required);• Experience in data analysis/ data integrity/ data governance;• Experience in analytical tools including PowerBI development, Python, coding, Excel, SQL, SOQL, Jira, and others.\n\nResponsibilities include but are not limited to the following:• Analyze data quickly using multiple tools and strategies including creating advanced algorithms;• Serve as a critical member of data integrity team within digital solutions group and supplies detailed analysis on key data elements that flow between systems to help design governance and master data management strategies and ensure data cleanliness.',
    "QualificationsAdvanced degree (MS with 5+ years of industry experience, or Ph.D.) in Computer Science, Data Science, Statistics, or a related field, with an emphasis on AI and machine learning.Proficiency in Python and deep learning libraries, notably PyTorch and Hugging Face, Lightning AI, evidenced by a history of deploying AI models.In-depth knowledge of the latest trends and techniques in AI, particularly in multivariate time-series prediction for financial applications.Exceptional communication skills, capable of effectively conveying complex technical ideas to diverse audiences.Self-motivated, with a collaborative and solution-oriented approach to problem-solving, comfortable working both independently and as part of a collaborative team.\n\nCompensationThis role is compensated with equity until the product expansion and securing of Series A investment. Cash-based compensation will be determined after the revenue generation has been started. As we grow, we'll introduce additional benefits, including performance bonuses, comprehensive health insurance, and professional development opportunities. \nWhy Join BoldPine?\nInfluence the direction of financial market forecasting, contributing to groundbreaking predictive models.Thrive in an innovative culture that values continuous improvement and professional growth, keeping you at the cutting edge of technology.Collaborate with a dedicated team, including another technical expert, setting new benchmarks in AI-driven financial forecasting in a diverse and inclusive environment.\nHow to Apply\nTo join a team that's redefining financial forecasting, submit your application, including a resume and a cover letter. At BoldPine, we're committed to creating a diverse and inclusive work environment and encouraging applications from all backgrounds. Join us, and play a part in our mission to transform financial predictions.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
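Because the final Normalize() module outputs unit-length vectors, cosine similarity reduces to a plain dot product, so ranking job descriptions against a query is just a sort. A minimal pure-Python sketch (the toy vectors stand in for `model.encode()` output; the `query_emb`/`job_embs` names are illustrative):

```python
# Hedged sketch: with unit-normalized embeddings, cosine similarity is a
# dot product, so job ranking is a sort by dot product with the query.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy unit vectors standing in for model.encode() output.
query_emb = [1.0, 0.0]
job_embs = {"job_a": [0.8, 0.6], "job_b": [0.0, 1.0]}

ranked = sorted(job_embs, key=lambda k: dot(query_emb, job_embs[k]), reverse=True)
# ranked[0] is the job description most similar to the query
```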

Evaluation

Metrics

Triplet

Metric           ai-job-validation  ai-job-test
cosine_accuracy  0.9901             1.0
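The cosine accuracy metric counts a triplet as correct when the query embedding is closer (by cosine similarity) to the positive job description than to the negative one. A toy pure-Python sketch of the computation (the vectors and the `triplet_accuracy` helper are illustrative, not the evaluator's code):

```python
import math

# Sketch of triplet cosine accuracy: fraction of (query, positive, negative)
# triplets where cos(query, positive) > cos(query, negative).
def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def triplet_accuracy(triplets):
    correct = sum(1 for q, pos, neg in triplets if cos(q, pos) > cos(q, neg))
    return correct / len(triplets)

# Two toy triplets, both ranked correctly.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
]
acc = triplet_accuracy(toy_triplets)  # 1.0
```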

Training Details

Training Dataset

ai-job-embedding-finetuning

  • Dataset: ai-job-embedding-finetuning at c86ac36
  • Size: 809 training samples
  • Columns: query, job_description_pos, and job_description_neg
  • Approximate statistics based on the first 809 samples:

                 query   job_description_pos  job_description_neg
    type         string  string               string
    min tokens   8       7                    7
    mean tokens  15.02   348.14               351.24
    max tokens   40      512                  512
  • Samples:
    query job_description_pos job_description_neg
    Data engineering Azure cloud Apache Spark Kafka Skills:Proven experience in data engineering and workflow development.Strong knowledge of Azure cloud services.Proficiency in Apache Spark and Apache Kafka.Excellent programming skills in Python/Java.Hands-on experience with Azure Synapse, DataBricks, and Azure Data Factory.
    Nice To Have Skills:Experience with BI Tools such as Tableau or Power BI.Familiarity with Terraform for infrastructure as code.Knowledge of Git Actions for CI/CD pipelines.Understanding of database design and architecting principles.Strong communication skills and ability to manage technical projects effectively.
    requirements, and assist in data structure implementation planning for innovative data visualization, predictive modeling, and advanced analytics solutions.* Unfortunately, we cannot accommodate Visa Sponsorship for this role at this time.
    ESSENTIAL JOB FUNCTIONS
    Mine data covering a wide range of information from customer profile to transaction details to solve risk problems that involve classification, clustering, pattern analysis, sampling and simulations.Apply strong data science expertise and systems analysis methodology to help guide solution analysis, working closely with both business and technical teams, with consideration of both technical and non-technical implications and trade-offs.Carry out independent research and innovation in new content, ML, and technological domains. Trouble shooting any data, system and flow challenges while maintaining clearly defined strategy execution.Extract data from various data sources; perform exploratory data analysis, cleanse, transform, a...
    Databricks, Medallion architecture, ETL processes experience with Databricks, PySpark, SQL, Spark clusters, and Jupyter Notebooks.- Expertise in building data lakes using the Medallion architecture and working with delta tables in the delta file format.- Familiarity with CI/CD pipelines and Agile methodologies, ensuring efficient and collaborative development practices.- Strong understanding of ETL processes, data modeling, and data warehousing principles.- Experience with data visualization tools like Power BI is a plus.- Knowledge of cybersecurity data, particularly vulnerability scan data, is preferred.- Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
    requirements and deliver effective solutions aligned with Medallion architecture principles.- Ensure data quality and implement robust data governance standards, leveraging the scalability and efficiency offered by the Medallion architecture.- Design and implement ETL processes, including data cleansing, transformation, and integration, opti...
    experience with a minimum of 0+ years of experience in a Computer Science or Data Management related fieldTrack record of implementing software engineering best practices for multiple use cases.Experience of automation of the entire machine learning model lifecycle.Experience with optimization of distributed training of machine learning models.Use of Kubernetes and implementation of machine learning tools in that context.Experience partnering and/or collaborating with teams that have different competences.The role holder will possess a blend of design skills needed for Agile data development projects.Proficiency or passion for learning, in data engineer techniques and testing methodologies and Postgraduate degree in data related field of study will also help.


    Desirable for the role


    Experience with DevOps or DataOps concepts, preferably hands-on experience implementing continuous integration or highly automated end-to-end environments.Interest in machine learning will also be advan...
    Gas Processing, AI Strategy Development, Plant Optimization experience in AI applications for the Hydrocarbon Processing & Control Industry, specifically, in the Gas Processing and Liquefaction business. Key ResponsibilitiesYou will be required to perform the following:- Lead the development and implementation of AI strategies & roadmaps for optimizing gas operations and business functions- Collaborate with cross-functional teams to identify AI use cases to transform gas operations and business functions (AI Mapping)- Design, develop, and implement AI models and algorithms that solve complex problems- Implement Gen AI use cases to enhance natural gas operations and optimize the Gas business functions- Design and implement AI-enabled plant optimizers for efficiency and reliability- Integrate AI models into existing systems and applications- Troubleshoot and resolve technical issues related to AI models and deployments- Ensure compliance with data privacy and security regulations- Stay up-to-date with the latest advancements in AI and machine lea... QualificationsAbility to gather business requirements and translate them into technical solutionsProven experience in developing interactive dashboards and reports using Power BI (3 years minimum)Strong proficiency in SQL and PythonStrong knowledge of DAX (Data Analysis Expressions)Experience working with APIs inside of Power BIExperience with data modeling and data visualization best practicesKnowledge of data warehousing concepts and methodologiesExperience in data analysis and problem-solvingExcellent communication and collaboration skillsBachelor's degree in Computer Science, Information Systems, or a related fieldExperience with cloud platforms such as Azure or AWS is a plus
    HoursApproximately 15 - 20 hours per week for 3 months with the opportunity to extend the contract further
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
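MultipleNegativesRankingLoss treats every other in-batch positive as a negative for a given query: similarities are scaled by `scale` (20.0 here, with cosine similarity) and a cross-entropy picks out the matching positive. An illustrative pure-Python sketch of the math, not the library implementation:

```python
import math

# Sketch of MultipleNegativesRankingLoss: for query i, all in-batch positives
# are candidates; the loss is cross-entropy over scaled cosine similarities
# with index i as the target class.
def mnr_loss(query_embs, pos_embs, scale=20.0):
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    total = 0.0
    for i, q in enumerate(query_embs):
        scores = [scale * cos(q, p) for p in pos_embs]
        log_sum = math.log(sum(math.exp(s) for s in scores))
        total += log_sum - scores[i]  # -log softmax at the true index
    return total / len(query_embs)

# Well-separated toy pairs give a loss near zero.
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
loss = mnr_loss(queries, positives)
```

This is why larger batches tend to help with this loss: each query sees more in-batch negatives.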
    

Evaluation Dataset

ai-job-embedding-finetuning

  • Dataset: ai-job-embedding-finetuning at c86ac36
  • Size: 101 evaluation samples
  • Columns: query, job_description_pos, and job_description_neg
  • Approximate statistics based on the first 101 samples:

                 query   job_description_pos  job_description_neg
    type         string  string               string
    min tokens   8       27                   14
    mean tokens  15.39   381.37               320.86
    max tokens   29      512                  512
  • Samples:
    query job_description_pos job_description_neg
    Data Engineer Snowflake ETL big data processing experience. But most of all, we’re backed by a culture of respect. We embrace authenticity and inspire people to thrive.

    The CDE Data Engineer will join the Content Delivery Engineering team, within the Global Video Engineering organization at NBCUniversal. The CDE Data Engineer will be responsible for implementing and maintaining systems that ingest, process, and store vast amounts of data from internal systems and external partner systems. These data systems must be scalable, robust, and within budget. In this role, the CDE Data Engineer will work with a variety of technologies that support the building of meaningful models, alerts, reports, and visualizations from vast quantities of data.

    Responsibilities Include, But Are Not Limited To

    Development of data systems and pipelinesAssist in cleansing, discretization, imputation, selection, generalization etc. to create high quality features for the modeling processWork with business stakeholders to define business requirements includ...
    Skills Looking For:- The project involves creating a unified data structure for Power BI reporting.- Candidate would work on data architecture and unifying data from various sources.- Data engineering expertise, including data modeling and possibly data architecture.- Proficiency in Python, SQL, and DAX.- Work with AWS data, and data storage.- Experience with cloud platforms like AWS is preferred.- Familiarity with Microsoft Power Automate and Microsoft Fabric is a plus.- Collaborating with users to understand reporting requirements for Power BI. Must be good at using Power BI tools (creating dashboards); excellent Excel skills.- Supply chain background preferred.
    Education and Level of Experience:- Bachelor's degree (quantitative learnings preferred- data analytics, statistics, computer science, math) with 3 to 5 years of experience.- Must have recent and relevant experience.
    Top 3 Skills:- Data engineering, including data modeling and data architecture.- Proficiency in Python, SQL, a...
    GenAI applications, NLP model development, MLOps pipelines experience building enterprise level GenAI applications, designed and developed MLOps pipelines . The ideal candidate should have deep understanding of the NLP field, hands on experience in design and development of NLP models and experience in building LLM-based applications. Excellent written and verbal communication skills with the ability to collaborate effectively with domain experts and IT leadership team is key to be successful in this role. We are looking for candidates with expertise in Python, Pyspark, Pytorch, Langchain, GCP, Web development, Docker, Kubeflow etc. Key requirements and transition plan for the next generation of AI/ML enablement technology, tools, and processes to enable Walmart to efficiently improve performance with scale. Tools/Skills (hands-on experience is must):• Ability to transform designs ground up and lead innovation in system design• Deep understanding of GenAI applications and NLP field• Hands on experience in the design and development of NLP mode... skills, education, experience, and other qualifications.
    Featured Benefits:
    Medical Insurance in compliance with the ACA401(k)Sick leave in compliance with applicable state, federal, and local laws
    Description: Responsible for performing routine and ad-hoc analysis to identify actionable business insights, performance gaps and perform root cause analysis. The Data Analyst will perform in-depth research across a variety of data sources to determine current performance and identify trends and improvement opportunities. Collaborate with leadership and functional business owners as well as other personnel to understand friction points in data that cause unnecessary effort, and recommend gap closure initiatives to policy, process, and system. Qualification: Minimum of three (3) years of experience in data analytics, or working in a data analyst environment.Bachelor’s degree in Data Science, Statistics, Applied Math, Computer Science, Business, or related field of study from an accredited ...
    data engineering ETL cloud platforms data security experience. While operating within the Banks risk appetite, achieves results by consistently identifying, assessing, managing, monitoring, and reporting risks of all types.

    ESSENTIAL DUTIES AND SKILLS, AND ABILITIES REQUIRED:

    Bachelors degree in Computer Science/Information Systems or equivalent combination of education and experience. Must be able to communicate ideas both verbally and in writing to management, business and IT sponsors, and technical resources in language that is appropriate for each group. Fundamental understanding of distributed computing principles Knowledge of application and data security concepts, best practices, and common vulnerabilities. Conceptual understanding of one or more of the following disciplines preferred big data technologies and distributions, metadata management products, commercial ETL tools, Bi and reporting tools, messaging systems, data warehousing, Java (language and run time environment), major version control systems, continuous integra...
    experience.

    We are looking for a highly energetic and collaborative Senior Data Engineer with experience leading enterprise data projects around Business and IT operations. The ideal candidate should be an expert in leading projects in developing and testing data pipelines, data analytics efforts, proactive issue identification and resolution and alerting mechanism using traditional, new and emerging technologies. Excellent written and verbal communication skills and ability to liaise with technologists to executives is key to be successful in this role.
    • Assembling large to complex sets of data that meet non-functional and functional business requirements• Identifying, designing and implementing internal process improvements including re-designing infrastructure for greater scalability, optimizing data delivery, and automating manual processes• Building required infrastructure for optimal extraction, transformation and loading of data from various data sources using GCP/Azure and S...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch  Step  ai-job-validation_cosine_accuracy  ai-job-test_cosine_accuracy
0      0     0.8812                             -
1.0    51    0.9901                             1.0

Framework Versions

  • Python: 3.12.8
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0
  • PyTorch: 2.5.1
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}