HyukjinKwon

Organizations

@apache @databricks @cloudpipe @conda-forge @spark-korea @data-apis @py4j

The goal of this library is to provide a compatibility layer that makes it easier to adopt Spark Connect. The library is designed to be simply imported in your application and will then monkey-patc…

Python 16 1 Updated Aug 5, 2024

Matter (formerly Project CHIP) creates more connections between more objects, simplifying development for manufacturers and increasing compatibility for consumers, guided by the Connectivity Standa…

C++ 7,278 1,939 Updated Aug 12, 2024

English SDK for Apache Spark

Python 829 122 Updated Jun 12, 2024

Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4-3.12. You feed it your Python app, it does a lot of clever things, and spits out an executable or exte…

Python 11,578 632 Updated Aug 12, 2024

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

Python 10,810 1,156 Updated Jun 30, 2023

Embed Python in Java

C 1,284 147 Updated Aug 3, 2024

Py4J enables Python programs to dynamically access arbitrary Java objects

Java 1,171 215 Updated Jun 20, 2024

Mirror of https://gitlab.com/zero323/dlt

R 6 Updated Nov 25, 2022

📖 HonKit is building beautiful books using Markdown - Fork of GitBook

TypeScript 2,980 218 Updated Jun 18, 2024

An open protocol for secure data sharing

Scala 739 163 Updated Aug 9, 2024

A directive for including a Plotly figure in a Sphinx document.

Python 13 1 Updated Oct 25, 2022

A library on top of either pex or conda-pack to make your Python code easily available on a cluster

Python 45 21 Updated Apr 26, 2024

Package conda environments for redistribution

Python 508 89 Updated Aug 8, 2024

✅ The missing status check utility for workflow_run action.

TypeScript 29 13 Updated Feb 16, 2023

Spark RAPIDS plugin - accelerate Apache Spark with GPUs

Scala 778 228 Updated Aug 12, 2024

A clean, three-column Sphinx theme with Bootstrap for the PyData community

Python 575 305 Updated Aug 8, 2024

All the things about TPC-DS in Apache Spark

Scala 103 37 Updated Jun 15, 2023

Apache (Py)Spark type annotations (stub files).

Python 114 37 Updated Aug 17, 2022

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 7,349 1,652 Updated Aug 12, 2024

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

Python 42,961 17,684 Updated Aug 12, 2024

Koalas: pandas API on Apache Spark

Python 3,327 356 Updated Mar 20, 2024

NumPy aware dynamic Python compiler using LLVM

Python 9,704 1,113 Updated Aug 12, 2024

A lightweight library to inject LLVM bitcode into JVMs

C++ 81 7 Updated Dec 9, 2019

JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter

Java 1,777 340 Updated Jan 29, 2024

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

Scala 874 599 Updated Aug 12, 2024

Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks

358 80 Updated Jun 6, 2017

Production-Grade Container Scheduling and Management

Go 109,121 39,094 Updated Aug 12, 2024

Awesome list for Paxos and friends

2,028 205 Updated May 29, 2024

A curated list of awesome JSON datasets that don't require authentication.

JavaScript 3,266 376 Updated Jul 23, 2024