Ruiqing's Blog

Posted on 2024-08-08|lecture|information system•paper writing

CDS Research Paper Template 📝 Pay attention to logical completeness and rigor! Keep asking what each part is meant to verify: from the contributions stated in the introduction, to the methodological gaps identified in related work, to the key design choices of the method, to the experimental design of the evaluation. Introduction 🎯 Introduce the business or societal problem to be solved and describe its significant impacts on real-world practices; also identify research gaps and highlight the methodological innovations or contributions achieved by a CDS study. 1. problem elaboration problem formulation problem motivation / importance, e.g., ju ...

Computational Design Science in Information Systems Research

Posted on 2024-08-06|lecture|information system•method design

Computational Design Science: A Critical Information Systems Research Area Contributing to AI and Data Science Speaker: Prof. Xiao Fang Information Systems (IS) Research IS research centers on the interactions between information technology (IT) and the people, organizations, and society that use IT to achieve their goals (Lee, 1999 in MISQ). The design science paradigm designs and evaluates novel IT artifacts that enable people, organizations, and society to accomplish their objectives (Hevne ...

Survey Materials on LLM Privacy and Security

Posted on 2023-12-08|paper reading•survey|LLM security•privacy leakage•finetune

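About LLM Security & Privacy Main questions: Which steps of the LLM pipeline involve privacy leakage, and what specific private information is involved? Which techniques address the corresponding privacy and security problems? LLM data privacy News about LLM Safety March 25, 2023: OpenAI published a post confirming that some ChatGPT Plus subscribers may have had part of their personal and payment information exposed. October 30, 2023: the US President signed an executive order that sets new standards for the safe, secure, and ethical use of artificial intelligence, aiming to realize the potential of AI while reducing its risks. Privacy protection is one of its main areas of focus. FACT SHEET: President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence | The White House November 29, 2023: OpenAI's custom chatbots (GPTs) are leaking user privacy: the initial instructions the chatbots were given when they we ...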

Oort: Efficient Federated Learning via Guided Participant Selection

Posted on 2023-11-26|paper reading|centralized federated learning•client selection

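Oort: Efficient Federated Learning via Guided Participant Selection Venue: OSDI'21 (a top conference in operating systems) Paper: osdi21 Code: Github Authors: Fan Lai, Xiangfeng Zhu, Harsha V. Madhyastha, Mosharaf Chowdhury Affiliation: University of Michigan Background and motivation: When federated learning is used in practice, a basic question is how to select a "good" subset of clients as participants. Each participant processes its own data locally, and the central server only collects and aggregates their results. Existing methods, which assume randomly selected participants, optimize two aspects: statistical model efficiency (higher training accuracy in fewer training rounds) and system efficiency (fewer communication rounds). Although random participant selection is convenient to deploy, federated learning is likely to perform poorly when device speeds and data distributions differ widely. Worse, random participant selection leads to a biased dataset, which greatly reduces the credibility of the results. As a result, in general ...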
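The random-selection baseline described in the excerpt can be sketched as follows. This is a minimal illustration, not Oort's guided selection: the function names `select_participants` and `aggregate` are hypothetical, and weighting each client's result by its local sample count is an assumption (the excerpt only says that the server collects and aggregates the results).

```python
import random


def select_participants(client_ids, k, seed=None):
    """Uniformly sample k distinct clients (the random baseline)."""
    rng = random.Random(seed)
    return rng.sample(list(client_ids), k)


def aggregate(updates):
    """Average client results, weighted by local sample count.

    `updates` is a list of (num_samples, weights) pairs, where `weights`
    is a flat list of floats. The weighting scheme is an assumption.
    """
    total = sum(num_samples for num_samples, _ in updates)
    dim = len(updates[0][1])
    return [
        sum(num_samples * weights[i] for num_samples, weights in updates) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    participants = select_participants(range(100), k=5, seed=0)
    print("selected clients:", participants)
    # Two toy client results: (number of local samples, model weights).
    print("aggregated:", aggregate([(1, [0.0, 2.0]), (3, [4.0, 2.0])]))
```

Because every client is equally likely to be picked, slow devices and clients with unrepresentative data are selected as often as any other, which is the weakness the excerpt points out.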

Environment Setup and Code Notes

Posted on 2023-08-26|code|python environment•torch + cuda

torch + torch_geometric installation

torch 1.13.1 + cuda 11.7: the CUDA build only needs to be no newer than the CUDA version installed on the server (check with nvidia-smi).

    pip3 install torch==1.13.1+cu117 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117

Verify:

    import torch
    torch.cuda.is_available()  # True

torch_geometric: https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html

    pip install torch_geometric
    # Optional dependencies:
    pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/ ...
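The version pairing in the torch install command can be sketched as a small helper. The function names below are hypothetical and not part of PyTorch or pip; the sketch only rebuilds the `cu117`-style tag and the command string from the note, and compares the wheel's CUDA version with the one reported by nvidia-smi.

```python
def cuda_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '11.7' into a wheel tag such as 'cu117'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def is_supported(wheel_cuda: str, server_cuda: str) -> bool:
    """True if the wheel's CUDA version is no newer than the server's."""
    def as_tuple(version: str):
        return tuple(int(part) for part in version.split(".")[:2])

    return as_tuple(wheel_cuda) <= as_tuple(server_cuda)


def torch_install_command(torch_version: str, cuda_version: str) -> str:
    """Rebuild the pip command from the note for a given version pair."""
    tag = cuda_tag(cuda_version)
    return (
        f"pip3 install torch=={torch_version}+{tag} torchvision torchaudio "
        f"--extra-index-url https://download.pytorch.org/whl/{tag}"
    )


if __name__ == "__main__":
    # Example: the server's nvidia-smi is assumed to report CUDA 12.2.
    if is_supported("11.7", "12.2"):
        print(torch_install_command("1.13.1", "11.7"))
```

The helper does not check that a wheel for a given pair is actually published; that still has to be confirmed against the PyTorch download index.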

Atomic Habits: How to Build Good Habits and Break Bad Ones

Posted on 2023-08-23|book|self-management

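Excerpts: It is easy to overestimate the importance of one defining moment and to underestimate the value of making small improvements every day. Habits are the compound interest of self-improvement.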

PyHealth: A Deep Learning Toolkit for Clinical Predictive Modeling

Posted on 2023-08-22|tool|healthcare prediction•toolkit

PyHealth: A Comprehensive Deep Learning Toolkit For Clinical Predictive Modeling Venue: KDD'23 Tutorial Authors: Chaoqi Yang, Zhenbang Wu, Patrick Jiang, Zhen Lin, Junyi Gao, Benjamin P. Danek, Jimeng Sun Affiliations: UIUC, UE Homepage: PyHealth Code: Github Tutorial materials: google drive A fairly complete project with detailed documentation, examples, and videos; well worth studying 👍👍 It covers many types of tasks: Clinical predictive modeling: readmission prediction, length of stay prediction, mortality prediction, diagnosis-based drug recommendation Deep learning on physiological signals: sleep staging, EEG event detection, abnormal EEG detection, ...

GraphCare: Enhancing Healthcare Predictions with Open-World Personalized Knowledge Graphs

Posted on 2023-08-21|paper reading|knowledge graph application•healthcare prediction•LLM application

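GraphCare: Enhancing Healthcare Predictions with Open-World Personalized Knowledge Graphs Paper: ArXiv Code: Github Date: 2023.05.22 Authors: Pengcheng Jiang, Cao Xiao, Adam Cross, Jimeng Sun Affiliations: UIUC, Relativity, OSF HealthCare Background and motivation: Common healthcare prediction tasks include mortality prediction, length-of-stay prediction, readmission prediction, and drug recommendation. To improve predictions made from electronic health record data while combining expert knowledge with insights from the data, clinical knowledge graphs (KGs) are an important source of knowledge. A KG contains medical concepts (e.g., diagnoses, treatments, drugs) and the relations among them. Limitations of existing methods: they focus mainly on hierarchical relations between entities (such as the hierarchy among ICD-9 or ICD-10 codes) and do not fully exploit the complex relations among the various entities in a KG to learn richer contextual knowledge; large language models (LLMs) have shown potential as knowledge bases and could serve as additional clinical knowledge extractors (no such attempt has been made yet). New idea: the paper proposes the concept of a personalized medical KG ...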